EVOLUTION OF WHOLE CELLS AND ORGANISMS BY RECURSIVE
SEQUENCE RECOMBINATION

CROSS-REFERENCE TO RELATED APPLICATION 
This application is a continuation-in-part of 09/116,188. The subject
application claims priority to this prior application, which is also incorporated by reference in
its entirety for all purposes.

FIELD OF THE INVENTION

The invention applies the technical field of molecular genetics to evolve the 

genomes of cells and organisms to acquire new and improved properties. 

BACKGROUND 

WO 98/31837 (PCT/US98/00852) provides pioneering technology for evolving
the genome of whole cells and organisms. One of skill will appreciate that the technology
provided in WO 98/31837 is fundamental to the ability of one of skill rapidly to evolve cells
and whole organisms. For example, the document teaches a variety of recursive methods of
artificially recombining nucleic acids in vivo, including entire genomes, and ways of selecting
resulting recombinant organisms.

This ability to evolve genes artificially is of fundamental importance. For
example, cells have a number of well-established uses in molecular biology, medicine and
industrial processes. For example, cells are commonly used as hosts for manipulating DNA in
processes such as transformation and recombination. Cells are used for expression of
recombinant proteins encoded by DNA transformed/transfected or otherwise introduced into
the cells. Some types of cells are used as progenitors for generation of transgenic animals and
plants. Although all of these processes are now routine, prior to the technology provided by
WO 98/31837, the genomes of the cells used in these processes had evolved little from the
genomes of natural cells, and particularly not toward acquisition of new or improved
properties for use in the above processes.

Additional methods of recursively recombining nucleic acids in vivo and
selecting resulting recombinants would be of use. The present invention provides a number of
new and valuable methods and compositions for whole and partial genome evolution.
SUMMARY OF THE INVENTION
In one aspect, the invention provides methods of evolving a cell to acquire a
desired function. Such methods entail, e.g., introducing a library of DNA fragments into a
plurality of cells, whereby at least one of the fragments undergoes recombination with a
segment in the genome or an episome of the cells to produce modified cells. Optionally, these
modified cells are bred to increase the diversity of the resulting recombined cellular
population. The modified cells, or the recombined cellular population, are then screened for
modified or recombined cells that have evolved toward acquisition of the desired function.
DNA from the modified cells that have evolved toward the desired function is then optionally
recombined with a further library of DNA fragments, at least one of which undergoes
recombination with a segment in the genome or the episome of the modified cells to produce
further modified cells. The further modified cells are then screened for further modified cells
that have further evolved toward acquisition of the desired function. Steps of recombination
and screening/selection are repeated as required until the further modified cells have acquired
the desired function. In one preferred embodiment, modified cells are recursively recombined
to increase diversity of the cells prior to performing any selection steps on any resulting cells.

In some methods, the library or further library of DNA fragments is coated
with recA protein to stimulate recombination with the segment of the genome. The library of
fragments is optionally denatured to produce single-stranded DNA, which is annealed to
produce duplexes, some of which contain mismatches at points of variation in the fragments.
Duplexes containing mismatches are optionally selected by affinity chromatography to
immobilized MutS.
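
As an illustration only, the annealing step can be sketched computationally: when denatured variants of a fragment reanneal, a duplex formed from two different variants carries a mismatch at each point of variation, and such duplexes are the ones an immobilized MutS column would be expected to retain. The sketch below assumes aligned, equal-length variants written as top strands; the function names and the example library are hypothetical and are not taken from the specification.

```python
from itertools import combinations


def mismatch_positions(variant_a: str, variant_b: str) -> list[int]:
    """Return 0-based positions at which two aligned variants differ."""
    if len(variant_a) != len(variant_b):
        raise ValueError("variants must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(variant_a, variant_b)) if a != b]


def heteroduplexes(variants: list[str]) -> dict[tuple[int, int], list[int]]:
    """Map each pair of variants that would anneal with at least one
    mismatch to the list of mismatch positions in that duplex."""
    pairs = {}
    for (i, a), (j, b) in combinations(enumerate(variants), 2):
        positions = mismatch_positions(a, b)
        if positions:  # perfectly matched duplexes are not retained
            pairs[(i, j)] = positions
    return pairs


# Hypothetical library: variants 0 and 2 are identical, 1 and 3 are mutants.
library = ["ACGTACGT", "ACGAACGT", "ACGTACGT", "ACGAACCT"]
selected = heteroduplexes(library)
for pair, positions in sorted(selected.items()):
    print(pair, positions)
```

Pairs of identical variants, such as variants 0 and 2 here, form perfectly matched duplexes and are absent from the result.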

Optionally, the desired function is secretion of a protein, and the plurality of
cells further comprises a construct encoding the protein. The protein is optionally inactive
unless secreted, and further modified cells are optionally selected for protein function.
Optionally, the protein is toxic to the plurality of cells, unless secreted. In this case, the
modified or further modified cells which evolve toward acquisition of the desired function are
screened by propagating the cells and recovering surviving cells.

In some methods, the desired function is enhanced recombination. In such
methods, the library of fragments sometimes comprises a cluster of genes collectively
conferring recombination capacity. Screening can be achieved using cells carrying a gene
encoding a marker whose expression is prevented by a mutation removable by recombination.
The cells are screened by their expression of the marker resulting from removal of the
mutation by recombination.
In some methods, the plurality of cells are plant cells and the desired property is
improved resistance to a chemical or microbe. The modified or further modified cells (or
whole plants) are exposed to the chemical or microbe and modified or further modified cells
having evolved toward the acquisition of the desired function are selected by their capacity to
survive the exposure.

In some methods, the plurality of cells are embryonic cells of an animal, and the 
method further comprises propagating the transformed cells to transgenic animals. 

The plurality of cells can be a plurality of industrial microorganisms that are
enriched for microorganisms which are tolerant to desired process conditions (heat, light,
radiation, selected pH, presence of detergents or other denaturants, presence of alcohols or
other organic molecules, etc.).

The invention further provides methods for performing in vivo recombination.
At least first and second segments from at least one gene are introduced into a cell, the
segments differing from each other in at least two nucleotides, whereby the segments
recombine to produce a library of chimeric genes. A chimeric gene having acquired a desired
function is selected from the library.

The invention further provides methods of predicting efficacy of a drug in
treating a viral infection. Such methods entail recombining a nucleic acid segment from a
virus, whose infection is inhibited by a drug, with at least a second nucleic acid segment from
the virus, the second nucleic acid segment differing from the first nucleic acid segment in at
least two nucleotides, to produce a library of recombinant nucleic acid segments. Host cells
are then contacted with a collection of viruses having genomes including the recombinant
nucleic acid segments in media containing the drug, and progeny viruses resulting from
infection of the host cells are collected.

A recombinant DNA segment from a first progeny virus recombines with at
least a recombinant DNA segment from a second progeny virus to produce a further library of
recombinant nucleic acid segments. Host cells are contacted with a collection of viruses
having genomes including the further library of recombinant nucleic acid segments, in media
containing the drug, and further progeny viruses are produced by the host cells. The
recombination and selection steps are repeated, as desired, until a further progeny virus has
acquired a desired degree of resistance to the drug, whereby the degree of resistance acquired
and the number of repetitions needed to acquire it provide a measure of the efficacy of the
drug in treating the virus. Viruses are optionally adapted to grow on particular cell lines.
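
The idea of counting repetitions as a measure of drug efficacy can be illustrated with a deliberately idealized sketch. It assumes, purely for illustration, that each virus is described by a set of resistance mutations, that recombining two progeny yields a recombinant carrying the mutations of both, and that selection in the presence of the drug keeps only the most resistant recombinants. None of these modelling choices, nor the names `cycles_to_resistance` and `keep`, come from the specification.

```python
from itertools import combinations


def cycles_to_resistance(viruses, target, keep=4, max_cycles=20):
    """Count recombination/selection cycles until one virus carries
    `target` resistance mutations. Returns (cycles, best), where `best`
    is the largest number of mutations found in any single virus."""
    population = [frozenset(v) for v in viruses]
    best = max(len(v) for v in population)
    cycles = 0
    while best < target and cycles < max_cycles:
        # Idealized recombination: each pair yields a progeny genome
        # carrying the mutations of both parents.
        progeny = {a | b for a, b in combinations(population, 2)}
        if not progeny:  # fewer than two genomes left to recombine
            break
        # Idealized selection in drug: keep the `keep` most resistant
        # progeny, breaking ties alphabetically for reproducibility.
        population = sorted(progeny, key=lambda v: (-len(v), sorted(v)))[:keep]
        cycles += 1
        best = max(len(v) for v in population)
    return cycles, best


# Four hypothetical single mutants; how many cycles until a quadruple mutant?
single_mutants = [{"a"}, {"b"}, {"c"}, {"d"}]
print(cycles_to_resistance(single_mutants, target=4))
```

Under these assumptions, a drug against which resistance requires more mutations takes more cycles to escape, which is the sense in which the number of repetitions serves as a measure of efficacy.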
The invention further provides methods of predicting efficacy of a drug in
treating an infection by a pathogenic microorganism. These methods entail delivering a library
of DNA fragments into a plurality of microorganism cells, at least some of which undergo
recombination with segments in the genome of the cells to produce modified microorganism
cells. Modified microorganisms are propagated in media containing the drug, and surviving
microorganisms are recovered. DNA from surviving microorganisms is recombined with a
further library of DNA fragments, at least some of which undergo recombination with cognate
segments in the DNA from the surviving microorganisms to produce further modified
microorganism cells. Further modified microorganisms are propagated in media containing
the drug, and further surviving microorganisms are collected. The recombination and
selection steps are repeated as needed, until a further surviving microorganism has acquired a
desired degree of resistance to the drug. The degree of resistance acquired and the number of
repetitions needed to acquire it provide a measure of the efficacy of the drug in killing the
pathogenic microorganism.

The invention further provides methods of evolving a cell to acquire a desired
function. These methods entail providing a population of different cells. The cells are
cultured under conditions whereby DNA is exchanged between cells, forming cells with hybrid
genomes. The cells are then screened or selected for cells that have evolved toward
acquisition of a desired property. The DNA exchange and screening/selecting steps are
repeated, as needed, with the screened/selected cells from one cycle forming the population of
different cells in the next cycle, until a cell has acquired the desired property.

Mechanisms of DNA exchange include conjugation, phage-mediated
transduction, liposome delivery, protoplast fusion, and sexual recombination of the cells.
Optionally, a library of DNA fragments can be transformed or electroporated into the cells.

As noted, some methods of evolving a cell to acquire a desired property are
effected by protoplast-mediated exchange of DNA between cells. Such methods entail
forming protoplasts of a population of different cells. The protoplasts are then fused to form
hybrid protoplasts, in which genomes from the protoplasts recombine to form hybrid genomes.
The hybrid protoplasts are incubated under conditions promoting regeneration of cells. The
regenerated cells can be recombined one or more times (i.e., via protoplasting or any other
method that combines genomes of cells) to increase the diversity of any resulting cells.
Preferably, regenerated cells are recombined several times, e.g., by protoplast fusion, to
generate a diverse population of cells.
The next step is to select or screen to isolate regenerated cells that have
evolved toward acquisition of the desired property. DNA exchange and selection/screening
steps are repeated, as needed, with regenerated cells in one cycle being used to form
protoplasts in the next cycle until the regenerated cells have acquired the desired property.
Industrial microorganisms are a preferred class of organisms for conducting the above
methods. Some methods further comprise a step of selecting or screening for fused
protoplasts free from unfused protoplasts of parental cells. Some methods further comprise a
step of selecting or screening for fused protoplasts with hybrid genomes free from cells with
parental genomes. In some methods, protoplasts are provided by treating individual cells,
mycelia or spores with an enzyme that degrades cell walls. In some methods, the strain is a
mutant that is lacking capacity for intact cell wall synthesis, and protoplasts form
spontaneously. In some methods, protoplasts are formed by treating growing cells with an
inhibitor of cell wall formation to generate protoplasts.

In some methods, the desired property is expression and/or secretion of a
protein or secondary metabolite, such as an industrial enzyme, a therapeutic protein, a primary
metabolite such as lactic acid or ethanol, or a secondary metabolite such as erythromycin,
cyclosporin A or taxol. In other methods it is the ability of the cell to convert compounds
provided to the cell to different compounds. In yet other methods, the desired property is
capacity for meiosis. In some methods, the desired property is compatibility to form a
heterokaryon with another strain.

The invention further provides methods of evolving a cell toward acquisition of
a desired property. These methods entail providing a population of different cells. DNA is
isolated from a first subpopulation of the different cells and encapsulated in liposomes.
Protoplasts are formed from a second subpopulation of the different cells. Liposomes are
fused with the protoplasts, whereby DNA from the liposomes is taken up by the protoplasts
and recombines with the genomes of the protoplasts. The protoplasts are incubated under
regenerating conditions. Regenerating or regenerated cells are then selected or screened for
evolution toward the desired property.

The invention further provides methods of evolving a cell toward acquisition of
a desired property using artificial chromosomes. Such methods entail introducing a DNA
fragment library cloned into an artificial chromosome into a population of cells. The cells are
then cultured under conditions whereby sexual recombination occurs between the cells, and
DNA fragments cloned into the artificial chromosome recombine by homologous
recombination with corresponding segments of endogenous chromosomes of the populations
of cells, and endogenous chromosomes recombine with each other. Cells can also be
recombined via conjugation. Any resulting cells can be recombined via any method noted
herein, as many times as desired, to generate a desired level of diversity in the resulting
recombinant cells. In any case, after generating a diverse library of cells, the cells are
screened and/or selected for those that have evolved toward acquisition of the desired
property. The method is then repeated with cells that have evolved toward the desired
property in one cycle forming the population of different cells in the next cycle. Here again,
multiple cycles of in vivo recombination are optionally performed prior to any additional
selection or screening steps.

The invention further provides methods of evolving a DNA segment cloned
into an artificial chromosome for acquisition of a desired property. These methods entail
providing a library of variants of the segment, each variant cloned into separate copies of an
artificial chromosome. The copies of the artificial chromosome are introduced into a
population of cells. The cells are cultured under conditions whereby sexual recombination
occurs between cells and homologous recombination occurs between copies of the artificial
chromosome bearing the variants. Variants are then screened or selected for evolution toward
acquisition of the desired property.

The invention further provides hyperrecombinogenic recA proteins. Examples
of such proteins are from clones 2, 4, 5, 6 and 13 shown in Fig. 13.

The invention also provides methods of reiterative pooling and breeding of
higher organisms. In the methods, a library of diverse multicellular organisms is produced
(e.g., plants, animals or the like). A pool of male gametes is provided along with a pool of
female gametes. At least one of the male pool or the female pool comprises a plurality of
different gametes derived from different strains of a species or different species. The male
gametes are used to fertilize the female gametes. At least a portion of the resulting fertilized
gametes grow into reproductively viable organisms. These reproductively viable organisms
are crossed (e.g., by pairwise pooling and joining of the male and female gametes as before) to
produce a library of diverse organisms. The library is then selected for a desired trait or
property.

The library of diverse organisms can comprise a plurality of plants such as
Gramineae, Festucoideae, Poacoideae, Agrostis, Phleum, Dactylis, Sorghum, Setaria, Zea,
Oryza, Triticum, Secale, Avena, Hordeum, Saccharum, Poa, Festuca, Stenotaphrum,
Cynodon, Coix, Olyreae, Phareae, Compositae or Leguminosae. For example, the plants can
be corn, rice, wheat, rye, oats, barley, pea, beans, lentil, peanut, yam bean, cowpeas,
velvet beans, soybean, clover, alfalfa, lupine, vetch, lotus, sweet clover, wisteria, sweetpea,
sorghum, millet, sunflower, canola or the like.
Similarly, the library of diverse organisms can include a plurality of animals
such as non-human mammals, fish, insects, or the like.

Optionally, a plurality of selected library members can be crossed by pooling
gametes from the selected members and repeatedly crossing any resulting additional
reproductively viable organisms to produce a second library of diverse organisms (e.g., by split
pairwise pooling and rejoining of the male and female gametes). Here again, the second
library can be selected for a desired trait or property, with the resulting selected members
forming the basis for additional poolwise breeding and selection.

A feature of the invention is the libraries made by these (or any preceding)
methods.

BRIEF DESCRIPTION OF THE DRAWING

Fig. 1, panels A-D: Scheme for in vitro shuffling of genes.
Fig. 2: Scheme for enriching for mismatched sequences using MutS.
Fig. 3: Alternative scheme for enriching for mismatched sequences using MutS.
Fig. 4: Scheme for evolving growth hormone genes to produce larger fish.
Fig. 5: Scheme for shuffling prokaryotes by protoplast fusion.
Fig. 6: Scheme for introducing a sexual cycle into fungi previously incapable of
sexual reproduction.
Fig. 7: General scheme for shuffling of fungi by protoplast fusion.
Fig. 8: Shuffling fungi by protoplast fusion with protoplasts generated by use
of inhibitors of enzymes responsible for cell wall formation.
Fig. 9: Shuffling fungi by protoplast fusion using fungal strains deficient in
cell-wall synthesis that spontaneously form protoplasts.
Fig. 10: YAC-mediated whole genome shuffling of Saccharomyces cerevisiae
and related organisms.
Fig. 11: YAC-mediated shuffling of large DNA fragments.
Fig. 12: (A, B, C and D) DNA sequences of a wildtype recA protein and five
hyperrecombinogenic variants thereof.
Fig. 13: Amino acid sequences of a wildtype recA protein and five
hyperrecombinogenic variants thereof.
Fig. 14: Illustration of combinatoriality.
Fig. 15: Repeated pairwise recombination to access multi-mutant progeny.
Fig. 16: Graph of fitness versus sequence space for three different mutation
strategies.
Fig. 17: Graphs of asexual sequential mutagenesis and sexual recursive
recombination.
Fig. 18: Schematic for non-homologous recombination.
Fig. 19: Schematic for split and pool strategy.
Fig. 20, panel A: Schematic for selectable/counterselectable marker strategy.
Fig. 20, panel B: Schematic for selectable/counterselectable marker strategy for
RecA.
Fig. 21: Plant regeneration strategy for regenerating salt-tolerant plants.
Fig. 22: Whole genome shuffling of parsed (subcloned) genomes.
Fig. 23: Schematic for blind cloning of gene homologs.
Fig. 24: High throughput family shuffling.
Fig. 25: Schematic and graph of poolwise recombination.
Fig. 26: Schematic of protoplast fusion.
Fig. 27: Schematic assay for poolwise recombination.
Fig. 28: Schematic of halo assay and integrated system.
Fig. 29: Schematic drawing illustrating recursive pooled breeding of fish.
Fig. 30: Schematic drawing illustrating recursive pooled breeding of plants.
Fig. 31: Schematic for shuffling of S. coelicolor.
Fig. 32: Schematic drawing illustrating HTP actinorhodin assay.
Fig. 33: Schematic drawing and table illustrating whole genome shuffling of
four parental strains.
Fig. 34: Schematic drawing of WGS through organized heteroduplex shuffling.
DETAILED DESCRIPTION

I. GENERAL

A. THE BASIC APPROACH 

The invention provides methods for artificially evolving cells to acquire a new
or improved property by recursive sequence recombination. Briefly, recursive sequence
recombination entails successive cycles of recombination to generate molecular diversity and
screening/selection to take advantage of that molecular diversity. That is, a family of nucleic
acid molecules is created showing substantial sequence and/or structural identity but differing
as to the presence of mutations. These sequences are then recombined in any of the described
formats so as to optimize the diversity of mutant combinations represented in the resulting
recombined library. Typically, any resulting recombinant nucleic acids or genomes are
recursively recombined for one or more cycles of recombination to increase the diversity of
resulting products. After this recursive recombination procedure, the final resulting products
are screened and/or selected for a desired trait or property.

Alternatively, each recombination cycle can be followed by at least one cycle of
screening or selection for molecules having a desired characteristic. In this embodiment, the
molecule(s) selected in one round form the starting materials for generating diversity in the
next round.
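
The cycle of recombination followed by screening can be sketched as a toy simulation. The sketch assumes, for illustration only, that each genome is a string of allele symbols, that recombination swaps alleles position by position between randomly paired genomes, and that the screen ranks genomes by the number of positions matching a desired genotype. The function names and the parameters `rounds_before_screen`, `keep` and `seed` are hypothetical; real screens score phenotypes, not sequence identity.

```python
import random


def recombine(pool, rng):
    """One recombination cycle: pair genomes at random and exchange
    alleles position by position with probability 0.5."""
    parents = pool[:]
    rng.shuffle(parents)
    progeny = []
    for a, b in zip(parents[::2], parents[1::2]):
        child1, child2 = [], []
        for x, y in zip(a, b):
            if rng.random() < 0.5:
                x, y = y, x
            child1.append(x)
            child2.append(y)
        progeny += ["".join(child1), "".join(child2)]
    return progeny


def fitness(genome, target):
    """Toy screen: number of positions matching the desired genotype."""
    return sum(g == t for g, t in zip(genome, target))


def evolve(pool, target, rounds_before_screen=2, keep=4, max_cycles=50, seed=0):
    """Alternate recombination and screening. Setting rounds_before_screen
    above 1 mimics recursive recombination prior to any selection step."""
    rng = random.Random(seed)
    for cycle in range(1, max_cycles + 1):
        for _ in range(rounds_before_screen):
            pool = pool + recombine(pool, rng)  # parents stay in the pool
        pool = sorted(pool, key=lambda g: fitness(g, target), reverse=True)[:keep]
        if fitness(pool[0], target) == len(target):
            return cycle, pool[0]
    return None, pool[0]


# Four hypothetical parents, each carrying two of the eight desired alleles.
parents = ["AAbbbbbb", "bbAAbbbb", "bbbbAAbb", "bbbbbbAA"]
cycles, best = evolve(parents, target="AAAAAAAA")
print(cycles, best)
```

With `rounds_before_screen=1` the same loop models the alternative format in which every recombination cycle is followed by a screen.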

The cells to be evolved can be bacteria, archaebacteria, or eukaryotic cells and
can constitute a homogeneous cell line or mixed culture. Suitable cells for evolution include
the bacterial and eukaryotic cell lines commonly used in genetic engineering, protein
expression, or the industrial production or conversion of proteins, enzymes, primary
metabolites, secondary metabolites, fine, specialty or commodity chemicals. Suitable
mammalian cells include those from, e.g., mouse, rat, hamster, primate, and human, both cell
lines and primary cultures. Such cells include stem cells, including embryonic stem cells and
hemopoietic stem cells, zygotes, fibroblasts, lymphocytes, Chinese hamster ovary (CHO),
mouse fibroblasts (NIH3T3), kidney, liver, muscle, and skin cells. Other eukaryotic cells of
interest include plant cells, such as maize, rice, wheat, cotton, soybean, sugarcane, tobacco,
and arabidopsis; fish, algae, fungi (penicillium, aspergillus, podospora, neurospora,
saccharomyces), insect (e.g., baculo lepidoptera), yeast (pichia and saccharomyces,
Schizosaccharomyces pombe). Also of interest are many bacterial cell types, both gram-
negative and gram-positive, such as Bacillus subtilis, B. licheniformis, B. cereus, Escherichia
coli, Streptomyces, Pseudomonas, Salmonella, Actinomycetes, Lactobacillus,
Acetonitcbacter, Deinococcus, and Erwinia. The complete genome sequences of E. coli and
Bacillus subtilis are described by Blattner et al., Science 277, 1454-1462 (1997); Kunst et al.,
Nature 390, 249-256 (1997).

Evolution commences by generating a population of variant cells. Typically,
the cells in the population are of the same type but represent variants of a progenitor cell. In
some instances, the variation is natural as when different cells are obtained from different
individuals within a species, from different species or from different genera. In other
instances, variation is induced by mutagenesis of a progenitor cell. Mutagenesis can be
effected by subjecting the cell to mutagenic agents, or if the cell is a mutator cell (e.g., has
mutations in genes involved in DNA replication, recombination and/or repair which favor
introduction of mutations) simply by propagating the mutator cells. Mutator cells can be
generated from successive selections for simple phenotypic changes (e.g., acquisition of
rifampicin-resistance, then nalidixic acid resistance, then lac- to lac+ (see Mao et al., J.
Bacteriol. 179, 417-422 (1997))), or mutator cells can be generated by exposure to specific
inhibitors of cellular factors that result in the mutator phenotype. These could be inhibitors of
mutS, mutL, mutD, recD, mutY, mutM, dam, uvrD and the like.

More generally, mutations are induced in cell populations using any available
mutation technique. Common mechanisms for inducing mutations include, but are not limited
to, the use of strains comprising mutations such as those involved in mismatch repair, e.g.
mutations in mutS, mutT, mutL and mutH; exposure to UV light; chemical mutagenesis, e.g.
use of inhibitors of MMR, DNA damage inducible genes, or SOS inducers; overproduction/
underproduction/ mutation of any component of the homologous recombination
complex/pathway, e.g. RecA, ssb, etc.; overproduction/ underproduction/ mutation of genes
involved in DNA synthesis/homeostasis; overproduction/ underproduction/ mutation of
recombination-stimulating genes from bacteria, phage (e.g. Lambda Red function), or other
organisms; addition of chi sites into/flanking the donor DNA fragments; coating the DNA
fragments with RecA/ssb and the like.

In other instances, variation is the result of transferring a library of DNA
fragments into the cells (e.g., by conjugation, protoplast fusion, liposome fusion,
transformation, transduction or natural competence). At least one, and usually many of the
fragments in the library, show some, but not complete, sequence or structural identity with a
cognate or allelic gene within the cells sufficient to allow homologous recombination to occur.

For example, in one embodiment, homologous integration of a plasmid carrying a shuffled
gene or metabolic pathway leads to insertion of the plasmid-borne sequences adjacent to the
genomic copy. Optionally, a counter-selectable marker strategy is used to select for
recombinants in which recombination occurred between the homologous sequences, leading to
elimination of the counter-selectable marker. This strategy is illustrated in Fig. 20A. A variety
of selectable and counter-selectable markers are amply illustrated in the art. For a list of useful
markers, see Berg and Berg (1996), Transposable element tools for microbial genetics.
Escherichia coli and Salmonella. Neidhardt. Washington, D.C., ASM Press. 2: 2588-2612;
LaRossa, ibid., 2527-2587. This strategy can be recursively repeated to maximize sequence
diversity of targeted genes prior to screening/selection for a desired trait or property.
The library of fragments can derive from one or more sources. One source of
fragments is a genomic library of fragments from a different species, cell type, organism or
individual from the cells being transfected. In this situation, many of the fragments in the
library have a cognate or allelic gene in the cells being transformed but differ from that gene
due to the presence of naturally occurring species variation, polymorphisms, mutations, and
the presence of multiple copies of some homologous genes in the genome. Alternatively, the
library can be derived from DNA from the same cell type as is being transformed after that
DNA has been subject to induced mutation, by conventional methods, such as radiation, error-
prone PCR, growth in a mutator organism, transposon mutagenesis, or cassette mutagenesis.
Alternatively, the library can derive from a genomic library of fragments generated from the
pooled genomic DNA of a population of cells having the desired characteristics.

In any of these situations, the genomic library can be a complete genomic
library or subgenomic library deriving, for example, from a selected chromosome, or part of a
chromosome or an episomal element within a cell. As well as, or instead of these sources of
DNA fragments, the library can contain fragments representing natural or selected variants of
selected genes of known function (i.e., focused libraries).

The number of fragments in a library can vary from a single fragment to about
10*^ with libraries having from 10^ to lO' fragments being common. The fragments should be 
sufficiently long that they can undergo homologous recombination and sufficiently short that
they can be introduced into a cell, and if necessary, manipulated before introduction.
Fragment sizes can range from about 10 b to about 20 Mb. Fragments can be double- or
single-stranded.

The fragments can be introduced into cells as whole genomes or as components
of viruses, plasmids, YACs, HACs or BACs or can be introduced as they are, in which case
all or most of the fragments lack an origin of replication. Use of viral fragments with single-
stranded genomes offers the advantage of delivering fragments in single stranded form, which
promotes recombination. The fragments can also be joined to a selective marker before
introduction. Inclusion of fragments in a vector having an origin of replication affords a
longer period of time after introduction into the cell in which fragments can undergo
recombination with a cognate gene before being degraded or selected against and lost from the
cell, thereby increasing the proportion of cells with recombinant genomes. Optionally, the
vector is a suicide vector capable of a longer existence than an isolated DNA fragment but not
capable of permanent retention in the cell line. Such a vector can transiently express a marker
for a sufficient time to screen for or select a cell bearing the vector (e.g., because cells
transduced by the vector are the target cell type to be screened in subsequent selection assays),
but is then degraded or otherwise rendered incapable of expressing the marker. The use of
such vectors can be advantageous in performing optional subsequent rounds of recombination
to be discussed below. For example, some suicide vectors express a long-lived toxin which is
neutralized by a short-lived molecule expressed from the same vector. Expression of the toxin
alone will not allow the vector to be established. Jensen & Gerdes, Mol. Microbiol. 17, 205-210
(1995); Bernard et al., Gene 162, 159-160. Alternatively, a vector can be rendered suicidal by
incorporation of a defective origin of replication (e.g. a temperature-sensitive origin of
replication) or by omission of an origin of replication. Vectors can also be rendered suicidal
by inclusion of negative selection markers, such as ura3 in yeast or sacB in many bacteria.
These genes become toxic only in the presence of specific compounds. Such vectors can be
selected to have a wide range of stabilities. A list of conditional replication defects for vectors
which can be used, e.g., to render the vector replication defective is found, e.g., in Berg and
Berg (1996), "Transposable element tools for microbial genetics" Escherichia coli and
Salmonella Neidhardt. Washington, D.C., ASM Press. 2: 2588-2612. Similarly, a list of
counterselectable markers, generally applicable to vector selection, is also found in Berg and
Berg, id. See also, LaRossa (1996) "Mutant selections linking physiology, inhibitors, and
genotypes" Escherichia coli and Salmonella F. C. Neidhardt. Washington, D.C., ASM Press.
2: 2527-2587.

After introduction into cells, the fragments can recombine with DNA present in
the genome or episomes of the cells by homologous, nonhomologous or site-specific
recombination. For present purposes, homologous recombination makes the most significant
contribution to evolution of the cells because this form of recombination amplifies the existing
diversity between the DNA of the cells being transfected and the DNA fragments. For
example, if a DNA fragment being transfected differs from a cognate or allelic gene at two
positions, there are four possible recombination products, and each of these recombination
products can be formed in different cells in the transformed population. Thus, homologous
recombination of the fragment doubles the initial diversity in this gene. When many fragments
recombine with corresponding cognate or allelic genes, the diversity of recombination
products with respect to starting products increases exponentially with the number of
mutations. Recombination results in modified cells having modified genomes and/or episomes.
Recursive recombination prior to selection further increases diversity of resulting modified 
cells. 
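
By way of illustration only, the counting argument above can be sketched in Python. The function name recombination_products and the example sequences are hypothetical and are not part of the disclosed methods; the sketch assumes two pre-aligned alleles of equal length and free recombination between every pair of differing positions.

```python
from itertools import product

def recombination_products(resident, fragment):
    """Enumerate every sequence obtainable by taking, at each position where
    the two aligned alleles differ, either the resident or the fragment base."""
    if len(resident) != len(fragment):
        raise ValueError("alleles must be aligned and of equal length")
    diff_sites = [i for i, (a, b) in enumerate(zip(resident, fragment)) if a != b]
    products = set()
    for choice in product((0, 1), repeat=len(diff_sites)):
        seq = list(resident)
        for site, take_fragment in zip(diff_sites, choice):
            if take_fragment:
                seq[site] = fragment[site]
        products.add("".join(seq))
    return sorted(products)

# Hypothetical alleles that differ at two positions (indices 1 and 6).
resident = "ACGTACGT"
fragment = "ATGTACCT"
products = recombination_products(resident, fragment)
print(len(products))  # two starting alleles give four products
```

With n differing positions the same enumeration yields 2^n products, which is the exponential growth in diversity referred to in the text.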

The variant cells, whether the result of natural variation, mutagenesis, or
recombination, are screened or selected to identify a subset of cells that have evolved toward
acquisition of a new or improved property. The nature of the screen, of course, depends on
the property and several examples will be discussed below. Typically, recombination is
repeated before initial screening. Optionally, however, the screening can also be repeated
before performing subsequent cycles of recombination. Stringency can be increased in
repeated cycles of screening.

The subpopulation of cells surviving screening is optionally subjected to a
further round of recombination. In some instances, the further round of recombination is
effected by propagating the cells under conditions allowing exchange of DNA between cells.
For example, protoplasts can be formed from the cells, allowed to fuse, and regenerated.
Cells with recombinant genomes are propagated from the fused protoplasts. Alternatively,
exchange of DNA can be promoted by propagation of cells or protoplasts in an electric field.
For cells having a conjugative transfer apparatus, exchange of DNA can be promoted simply
by propagating the cells. 

In other methods, the further round of recombination is performed by a split
and pool approach. That is, the surviving cells are divided into two pools. DNA is isolated
from one pool, and if necessary amplified, and then transformed into the other pool.
Accordingly, DNA fragments from the first pool constitute a further library of fragments and
recombine with cognate fragments in the second pool, resulting in further diversity. An
example of this strategy is illustrated in Fig. 19. As shown, a pool of mutant bacteria with
improvements in a desired phenotype is obtained and split. Genes are obtained from one half,
e.g., by PCR, by cloning of random genomic fragments, by infection with a transducing phage
and harvesting transducing particles, or by the introduction of an origin of transfer (OriT)
randomly into the relevant chromosome to create a donor population of cells capable of
transferring random fragments by conjugation to an acceptor population. These genes are
then shuffled (in vitro by known methods or in vivo as taught herein), or simply cloned into an
allele replacement vector (e.g., one carrying selectable and counter-selectable markers). The
gene pool is then transformed into the other half of the original mutant pool and recombinants
are selected and screened for further improvements in phenotype. These best variants are used
as the starting point for the next cycle. Alternatively, recursive recombination by any of the
methods noted can be performed prior to screening, thereby increasing the diversity of the 
population of cells to be screened. 

In other methods, some or all of the cells surviving screening are transfected
with a fresh library of DNA fragments, which can be the same as or different from the library
used in the first round of recombination. In this situation, the genes in the fresh library
undergo recombination with cognate genes in the surviving cells. If genes are introduced as
components of a vector, compatibility of this vector with any vector used in a previous round
of transfection should be considered. If the vector used in a previous round was a suicide
vector, there is no problem of incompatibility. If, however, the vector used in a previous
round was not a suicide vector, a vector having a different incompatibility origin should be
used in the subsequent round. In all of these formats, further recombination generates
additional diversity in the DNA component of the cells, resulting in further modified cells.

The further modified cells are subjected to another round of screening/selection
according to the same principles as the first round. Screening/selection identifies a
subpopulation of further modified cells that have further evolved toward acquisition of the
property. This subpopulation of cells can be subjected to further rounds of recombination and
screening according to the same principles, optionally with the stringency of screening being
increased at each round. Eventually, cells are identified that have acquired the desired 
property. 
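
The recursive cycle of recombination followed by screening at increasing stringency can be pictured with a toy simulation. The following Python sketch is illustrative only and rests on stated assumptions: each cell is a list of 0/1 alleles, the screened property is simply the count of 1 alleles, recombination is modeled as uniform crossover between two surviving cells, and the names recombine, screen and evolve are hypothetical.

```python
import random

def recombine(parent_a, parent_b, rng):
    """Uniform crossover: each position is inherited from one parent or the other."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

def screen(cells, threshold):
    """Keep cells whose score (count of beneficial alleles) meets the threshold."""
    return [cell for cell in cells if sum(cell) >= threshold]

def evolve(population, rounds, rng, start_threshold=1):
    """Alternate recombination and screening, raising stringency each round."""
    size = len(population)
    threshold = start_threshold
    best_per_round = []
    for _ in range(rounds):
        offspring = []
        for _ in range(size):
            parent_a, parent_b = rng.sample(population, 2)
            offspring.append(recombine(parent_a, parent_b, rng))
        # Parents stay in the screened pool so the best score cannot decrease.
        pool = sorted(population + offspring, key=sum, reverse=True)
        survivors = screen(pool, threshold)
        if len(survivors) < 2:
            # Stringency outran the population: fall back to the two best cells.
            survivors = pool[:2]
        population = survivors[:size]
        best_per_round.append(sum(population[0]))
        threshold += 1
    return population, best_per_round

rng = random.Random(0)
start = [[rng.randint(0, 1) for _ in range(10)] for _ in range(20)]
initial_best = max(sum(cell) for cell in start)
final_population, best_per_round = evolve(start, rounds=8, rng=rng)
print(initial_best, best_per_round)
```

Retaining the parents in the screened pool is a modeling convenience that guarantees the best score never falls between rounds; it is not a requirement of the methods described.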

II. DEFINITIONS

The term "cognate" refers to a gene sequence that is evolutionarily and
functionally related between species. For example, in the human genome, the human CD4
gene is the cognate gene to the mouse CD4 gene, since the sequences and structures of these
two genes indicate that they are homologous and that both genes encode a protein which
functions in signaling T-cell activation through MHC class II-restricted antigen recognition.

Screening is, in general, a two-step process in which one first determines which
cells do and do not express a screening marker or phenotype (or a selected level of marker or
phenotype), and then physically separates the cells having the desired property. Selection is a
form of screening in which identification and physical separation are achieved simultaneously
by expression of a selection marker, which, in some genetic circumstances, allows cells 
expressing the marker to survive while other cells die (or vice versa). Screening markers 
include luciferase, β-galactosidase, and green fluorescent protein. Selection markers include
drug and toxin resistance genes. 

An exogenous DNA segment is one foreign (or heterologous) to the cell or 
homologous to the cell but in a position within the host cell nucleic acid in which the element 
is not ordinarily found. Exogenous DNA segments can be expressed to yield exogenous 
polypeptides. 

The term "gene" is used broadly to refer to any segment of DNA associated 
with a biological function. Thus, genes include coding sequences and/or the regulatory 
sequences required for their expression. Genes also include nonexpressed DNA segments 
that, for example, form recognition sequences for other proteins.

The terms "identical" or "percent identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that 
are the same or have a specified percentage of amino acid residues or nucleotides that are the 
same, when compared and aligned for maximum correspondence, as measured using one of 
the following sequence comparison algorithms or by visual inspection. 

The phrase "substantially identical," in the context of two nucleic acids or
polypeptides, refers to two or more sequences or subsequences that have at least 60%,
preferably 80%, most preferably 90-95% nucleotide or amino acid residue identity, when
compared and aligned for maximum correspondence, as measured using one of the following
sequence comparison algorithms or by visual inspection. Preferably, the substantial identity
exists over a region of the sequences that is at least about 50 residues in length, more
preferably over a region of at least about 100 residues, and most preferably the sequences are
substantially identical over at least about 150 residues. In a most preferred embodiment, the
sequences are substantially identical over the entire length of the coding regions. 
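
For concreteness, percent identity over an existing alignment can be computed as in the following Python sketch. It is a minimal illustration under stated assumptions: the two sequences are already aligned to equal length with "-" as the gap character, columns that are gaps in both sequences are ignored, and a gap opposite a residue counts as a mismatch. The function names are hypothetical, and the alignment step itself (e.g., by one of the algorithms named in this section) is not shown.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of compared alignment columns in which both sequences
    carry the same residue. Inputs must already be aligned."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def substantially_identical(seq_a, seq_b, threshold=60.0):
    """Apply the lowest identity threshold recited above (at least 60%)."""
    return percent_identity(seq_a, seq_b) >= threshold

print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```

The 80% and 90-95% levels recited above can be tested by passing a different threshold value.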

For sequence comparison, typically one sequence acts as a reference sequence, 
to which test sequences are compared. When using a sequence comparison algorithm, test and 
reference sequences are input into a computer, subsequence coordinates are designated, if 
necessary, and sequence algorithm program parameters are designated. The sequence
comparison algorithm then calculates the percent sequence identity for the test sequence(s)
relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by the
local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the
search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444
(1988), or by computerized implementations of the algorithms GAP, BESTFIT, FASTA, and
TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer
Group, 575 Science Dr., Madison, WI.

Another example of a useful alignment algorithm is PILEUP. PILEUP creates
a multiple sequence alignment from a group of related sequences using progressive, pairwise
alignments to show relationship and percent sequence identity. It also plots a tree or
dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a
simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 35:351-
360 (1987). The method used is similar to the method described by Higgins & Sharp,
CABIOS 5: 151-153 (1989). The program can align up to 300 sequences, each of a maximum
length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the
pairwise alignment of the two most similar sequences, producing a cluster of two aligned
sequences. This cluster is then aligned to the next most related sequence or cluster of aligned
sequences. Two clusters of sequences are aligned by a simple extension of the pairwise
alignment of two individual sequences. The final alignment is achieved by a series of
progressive, pairwise alignments. The program is run by designating specific sequences and
their amino acid or nucleotide coordinates for regions of sequence comparison and by
designating the program parameters. For example, a reference sequence can be compared to
other test sequences to determine the percent sequence identity relationship using the
following parameters: default gap weight (3.00), default gap length weight (0.10), and
weighted end gaps.

Another example of an algorithm that is suitable for determining percent sequence
identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al.,
J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when aligned with a word of
the same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are then extended in 
both directions along each sequence for as far as the cumulative alignment score can be 
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters 
M (reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to 
calculate the cumulative score. Extension of the word hits in each direction is halted when:
the cumulative alignment score falls off by the quantity X from its maximum achieved value;
the cumulative score goes to zero or below, due to the accumulation of one or more negative-
scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm
parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN
program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation
(E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915
(1989)). 
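
The three halting conditions for extension of a word hit can be illustrated with a short Python sketch of ungapped extension in one direction for nucleotide sequences. This is a simplified illustration and not the BLAST implementation: the function name extend_right, the default value of X, and the example sequences are assumptions, and only the reward M and penalty N take the BLASTN defaults stated above.

```python
def extend_right(query, subject, q_pos, s_pos, seed_score, M=5, N=-4, X=20):
    """Extend a word hit to the right without gaps.

    q_pos and s_pos index the first residue after the seed word in each
    sequence. Returns (best cumulative score, number of residues added
    to reach that best score)."""
    score = best = seed_score
    best_len = 0
    length = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_pos + length < len(query) and s_pos + length < len(subject):
        match = query[q_pos + length] == subject[s_pos + length]
        score += M if match else N
        length += 1
        if score > best:
            best, best_len = score, length
        # Halting condition 1: score fell off by X from its maximum.
        # Halting condition 2: cumulative score went to zero or below.
        if best - score >= X or score <= 0:
            break
    return best, best_len

# Hypothetical example with a 4-letter seed "ACGT" scoring 4 * M = 20.
query = "ACGTACGTTTTT"
subject = "ACGTACGAAAAA"
print(extend_right(query, subject, 4, 4, seed_score=20, X=10))  # (35, 3)
```

In the example the extension gains three matches (score 35), then accumulates mismatches until the score has dropped by at least X = 10 from that maximum, at which point extension stops and the best-scoring extent is reported.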

In addition to calculating percent sequence identity, the BLAST algorithm also
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin &
Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5787 (1993)). One measure of similarity
provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an
indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a nucleic acid is considered similar to a
reference sequence if the smallest sum probability in a comparison of the test nucleic acid to
the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and
most preferably less than about 0.001.

A further indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the polypeptide encoded by the second nucleic acid, as
described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, for example, where the two peptides differ only by conservative substitutions.
Another indication that two nucleic acid sequences are substantially identical is that the two
molecules hybridize to each other under stringent conditions.

The term "naturally-occurring" is used to describe an object that can be found
in nature. For example, a polypeptide or polynucleotide sequence that is present in an 
organism (including viruses) that can be isolated from a source in nature and which has not 
been intentionally modified by man in the laboratory is naturally-occurring. Generally, the 
term naturally-occurring refers to an object as present in a non-pathological (undiseased)
individual, such as would be typical for the species. 

Asexual recombination is recombination occurring without the fusion of 
gametes to form a zygote. 

A "mismatch repair deficient strain" can include any mutants in any organism 
impaired in the functions of mismatch repair. These include mutant gene products of mutS,
mutT, mutH, mutL, uvrD, dcm, vsr, umuC, umuD, sbcB, recJ, etc. The impairment is
achieved by genetic mutation, allelic replacement, selective inhibition by an added reagent such
as a small compound or an expressed antisense RNA, or other techniques. Impairment can be
of the genes noted, or of homologous genes in any organism.

III. VARIATIONS

A. COATING FRAGMENTS WITH RECA PROTEIN 

The frequency of homologous recombination between library fragments and
cognate endogenous genes can be increased by coating the fragments with a recombinogenic
protein before introduction into cells. See Pati et al., Molecular Biology of Cancer 1, 1
(1996); Sena & Zarling, Nature Genetics 3, 365 (1996); Revet et al., J. Mol. Biol. 232, 779-
791 (1993); Kowalczykowski & Zarling in Gene Targeting (CRC 1995), Ch. 7. The
recombinogenic protein promotes homologous pairing and/or strand exchange. The best
characterized recA protein is from E. coli and is available from Pharmacia (Piscataway, NJ).
In addition to the wild-type protein, a number of mutant recA-like proteins have been
identified (e.g., recA803). Further, many organisms have recA-like recombinases with strand-
transfer activities (e.g., Ogawa et al., Cold Spring Harbor Symposium on Quantitative
Biology 18, 567-576 (1993); Johnson & Symington, Mol. Cell. Biol. 15, 4843-4850 (1995);
Fujisawa et al., Nucl. Acids Res. 13, 7473 (1985); Hsieh et al., Cell 44, 885 (1986); Hsieh et
al., J. Biol. Chem. 264, 5089 (1989); Fishel et al., Proc. Natl. Acad. Sci. USA 85, 3683
(1988); Cassuto et al., Mol. Gen. Genet. 208, 10 (1987); Ganea et al., Mol. Cell. Biol. 7, 3124
(1987); Moore et al., J. Biol. Chem. 19, 11108 (1990); Keene et al., Nucl. Acids Res. 12,
3057 (1984); Kmiec, Cold Spring Harbor Symp. 48, 675 (1984); Kmiec, Cell 44, 545
(1986); Kolodner et al., Proc. Natl. Acad. Sci. USA 84, 5560 (1987); Sugino et al., Proc.
Natl. Acad. Sci. USA 85, 3683 (1985); Halbrook et al., J. Biol. Chem. 264, 21403 (1989);
Eisen et al., Proc. Natl. Acad. Sci. USA 85, 7481 (1988); McCarthy et al., Proc. Natl. Acad.
Sci. USA 85, 5854 (1988); Lowenhaupt et al., J. Biol. Chem. 264, 20568 (1989)). Examples
of such recombinase proteins include recA, recA803, uvsX (Roca, A.I., Crit. Rev. Biochem.
Molec. Biol. 25, 415 (1990)), sep1 (Kolodner et al., Proc. Natl. Acad. Sci. (U.S.A.) 84, 5560
(1987); Tishkoff et al., Molec. Cell. Biol. 11, 2593), RuvC (Dunderdale et al., Nature 354,
506 (1991)), DST2, KEM1, XRN1 (Dykstra et al., Molec. Cell. Biol. 11, 2583 (1991)),
STPα/DST1 (Clark et al., Molec. Cell. Biol. 11, 2576 (1991)), HPP-1 (Moore et al., Proc.
Natl. Acad. Sci. (U.S.A.) 88, 9067 (1991)), and other eukaryotic recombinases (Bishop et al.,
Cell 69, 439 (1992); Shinohara et al., Cell 69, 457).

RecA protein forms a nucleoprotein filament when it coats a single-stranded
DNA. In this nucleoprotein filament, one monomer of recA protein is bound to about 3
nucleotides. This property of recA to coat single-stranded DNA is essentially sequence
independent, although particular sequences favor initial loading of recA onto a polynucleotide
(e.g., nucleation sequences). The nucleoprotein filament(s) can be formed on essentially any
DNA to be shuffled and can form complexes with both single-stranded and double-stranded
DNA in prokaryotic and eukaryotic cells.

Before contacting with recA or other recombinase, fragments are often
denatured, e.g., by heat-treatment. RecA protein is then added at a concentration of about
1-10 µM. After incubation, the recA-coated single-stranded DNA is introduced into recipient
cells by conventional methods, such as chemical transformation or electroporation. In general,
it can be desirable to coat the DNA with a RecA homolog isolated from the organism into
which the coated DNA is being delivered. Recombination involves several cellular factors and
the host RecA equivalent generally interacts better with other host factors than less closely
related RecA molecules. The fragments undergo homologous recombination with cognate
endogenous genes. Because of the increased frequency of recombination due to recombinase
coating, the fragments need not be introduced as components of vectors. 

Fragments are sometimes coated with other nucleic acid binding proteins that
promote recombination, protect nucleic acids from degradation, or target nucleic acids to the
nucleus. Examples of such proteins include Agrobacterium virE2 (Durrenberger et al., Proc.
Natl. Acad. Sci. USA 86, 9154-9158 (1989)). Alternatively, the recipient strains are deficient
in RecD activity. Single stranded ends can also be generated by 3'-5' exonuclease activity or
restriction enzymes producing 5' overhangs.

1. MutS Selection
The E. coli mismatch repair protein MutS can be used in affinity
chromatography to enrich for fragments of double-stranded DNA containing at least one base
of mismatch. The MutS protein recognizes the bubble formed by the individual strands about
the point of the mismatch. See, e.g., Hsu & Chang, WO 9320233. The strategy of affinity
enriching for partially mismatched duplexes can be incorporated into the present methods to
increase the diversity between an incoming library of fragments and corresponding cognate or
allelic genes in recipient cells.

Fig. 2 shows one scheme in which MutS is used to increase diversity. The
DNA substrates for enrichment are substantially similar to each other but differ at a few sites.
For example, the DNA substrates can represent complete or partial genomes (e.g., a
chromosome library) from different individuals with the differences being due to
polymorphisms. The substrates can also represent induced mutants of a wildtype sequence.
The DNA substrates are pooled, restriction digested, and denatured to produce fragments of
single-stranded DNA. The single-stranded DNA is then allowed to reanneal. Some single-
stranded fragments reanneal with a perfectly matched complementary strand to generate
perfectly matched duplexes. Other single-stranded fragments anneal to generate mismatched
duplexes. The mismatched duplexes are enriched from perfectly matched duplexes by MutS
chromatography (e.g., with MutS immobilized to beads). The mismatched duplexes recovered
by chromatography are introduced into recipient cells for recombination with cognate
endogenous genes as described above. MutS affinity chromatography increases the proportion
of fragments differing from each other and the cognate endogenous gene. Thus, 
recombination between the incoming fragments and endogenous genes results in greater 
diversity. 
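
The effect of the enrichment step in the scheme of Fig. 2 can be pictured with a small Python sketch. It is an idealized illustration under stated assumptions: every top strand is taken to reanneal with every bottom strand exactly once, each bottom strand is represented by the sequence of the variant it came from, and MutS binding is modeled as retention of any duplex with at least one mismatch. The function names and variant sequences are hypothetical.

```python
from itertools import product

def reanneal(variants):
    """Idealized reannealing: pair each top strand with each bottom strand.
    A duplex is recorded as (variant supplying top, variant supplying bottom)."""
    return list(product(variants, repeat=2))

def count_mismatches(duplex):
    """Number of positions at which the two parental variants differ."""
    top, bottom = duplex
    return sum(a != b for a, b in zip(top, bottom))

def muts_enrich(duplexes):
    """Model of MutS affinity chromatography: retain mismatched duplexes only."""
    return [d for d in duplexes if count_mismatches(d) >= 1]

# Three hypothetical variants of one fragment, differing at a few sites.
variants = ["ACGTAC", "ACGTTC", "ACCTAC"]
duplexes = reanneal(variants)
enriched = muts_enrich(duplexes)
print(len(duplexes), len(enriched))  # 9 duplexes before, 6 after enrichment
```

In this toy pool one third of the reannealed duplexes are perfectly matched and are discarded, so every duplex passed on to the recipient cells carries at least one difference between its strands.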

Fig. 3 shows a second strategy for MutS enrichment. In this strategy, the
substrates for MutS enrichment represent variants of a relatively short segment, for example, a
gene or cluster of genes, in which most of the different variants differ at no more than a single
nucleotide. The goal of MutS enrichment is to produce substrates for recombination that
contain more variations than sequences occurring in nature. This is achieved by fragmenting
the substrates at random to produce overlapping fragments. The fragments are denatured and
reannealed as in the first strategy. Reannealing generates some mismatched duplexes which
can be separated from perfectly matched duplexes by MutS affinity chromatography. As
before, MutS chromatography enriches for duplexes bearing at least a single mismatch. The
mismatched duplexes are then reassembled into longer fragments. This is accomplished by
cycles of denaturation, reannealing, and chain extension of partially annealed duplexes (see
Section V). After several such cycles, fragments of the same length as the original substrates
are achieved, except that these fragments differ from each other at multiple sites. These
fragments are then introduced into cells where they undergo recombination with cognate
endogenous genes.

2. Positive Selection For Allelic Exchange 
The invention further provides methods of enriching for cells bearing modified
genes relative to the starting cells. This can be achieved by introducing a DNA fragment
library (e.g., a single specific segment or a whole or partial genomic library) in a suicide vector
(e.g., lacking a functional replication origin in the recipient cell type) containing both positive
and negative selection markers. Optionally, multiple fragment libraries from different sources
(e.g., B. subtilis, B. licheniformis and B. cereus) can be cloned into different vectors bearing
different selection markers. Suitable positive selection markers include neoR, kanamycinR,
hyg, hisD, gpt, ble, tetR. Suitable negative selection markers include hsv-tk, hprt, gpt, SacB,
ura3 and cytosine deaminase. A variety of examples of conditional replication vectors,
mutations affecting vector replication, limited host range vectors, and counterselectable
markers are found in Berg and Berg, supra, and LaRossa, ibid. and the references therein.

In one example, a plasmid with R6K and f1 origins of replication, a positively
selectable marker (beta-lactamase), and a counterselectable marker (B. subtilis sacB) was
used. Following M13 transduction, plasmids containing cloned genes were efficiently
recombined into the chromosomal copy of that gene in a rep mutant E. coli strain.

Another strategy for applying negative selection is to include a wildtype rpsL
gene (encoding ribosomal protein S12) in a vector for use in cells having a mutant rpsL gene
conferring streptomycin resistance. The mutant form of rpsL is recessive in cells having
wildtype rpsL. Thus, selection for Sm resistance selects against cells having a wildtype copy
of rpsL. See Skorupski & Taylor, Gene 169, 47-52 (1996). Alternatively, vectors bearing only
a positive selection marker can be used with one round of selection for cells expressing the
marker, and a subsequent round of screening for cells that have lost the marker (e.g.,
screening for drug sensitivity). The screen for cells that have lost the positive selection marker
is equivalent to screening against expression of a negative selection marker. For example,
Bacillus can be transformed with a vector bearing a CAT gene and a sequence to be
integrated. See Harwood & Cutting, Molecular Biological Methods for Bacillus, at pp. 31-33.
Selection for chloramphenicol resistance isolates cells that have taken up vector. After a
suitable period to allow recombination, selection for CAT sensitivity isolates cells which have
lost the CAT gene. About 50% of such cells will have undergone recombination with the 
sequence to be integrated. 

Suicide vectors bearing a positive selection marker and optionally, a negative
selection marker and a DNA fragment can integrate into host chromosomal DNA by a single
crossover at a site in chromosomal DNA homologous to the fragment. Recombination
generates an integrated vector flanked by direct repeats of the homologous sequence. In some
cells, subsequent recombination between the repeats results in excision of the vector and either
acquisition of a desired mutation from the vector by the genome or restoration of the genome
to wildtype.

In the present methods, after transfer of the gene library cloned in a suitable
vector, positive selection is applied for expression of the positive selection marker. Because
nonintegrated copies of the suicide vector are rapidly eliminated from cells, this selection
enriches for cells that have integrated the vector into the host chromosome. The cells
surviving positive selection can then be propagated and subjected to negative selection, or
screened for loss of the positive selection marker. Negative selection selects against cells
expressing the negative selection marker. Thus, cells that have retained the integrated vector
express the negative marker and are selectively eliminated. The cells surviving both rounds of
selection are those that initially integrated and then eliminated the vector. These cells are
enriched for cells having genes modified by homologous recombination with the vector. This
process diversifies by a single exchange of genetic information. However, the process can be
repeated, either with the same vectors or with a library of fragments generated by PCR of
pooled DNA from the enriched recombinant population, resulting in the diversity of targeted
genes being enhanced exponentially with each round of recombination. This process can be
repeated recursively, with selection being performed as desired.
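
The logic of the two successive selections can be summarized in a short Python sketch. It is a bookkeeping illustration only, under stated assumptions: each cell is described by whether the suicide vector integrated, whether it was later excised, and which allele remains, and the names Cell, positive_selection and negative_selection are hypothetical.

```python
from typing import NamedTuple

class Cell(NamedTuple):
    label: str
    integrated: bool  # vector integrated by a single crossover
    excised: bool     # vector later lost by recombination between the repeats
    allele: str       # "modified" or "wildtype" at the target gene

def carries_vector(cell):
    return cell.integrated and not cell.excised

def positive_selection(cells):
    """Applied first: only cells that integrated the vector express the
    positive marker, since nonintegrated copies are rapidly eliminated."""
    return [c for c in cells if c.integrated]

def negative_selection(cells):
    """Applied after propagation: cells still carrying the vector express
    the negative marker and are eliminated."""
    return [c for c in cells if not carries_vector(c)]

population = [
    Cell("no integration", False, False, "wildtype"),
    Cell("integrant, vector retained", True, False, "wildtype"),
    Cell("excised, mutation kept", True, True, "modified"),
    Cell("excised, reverted", True, True, "wildtype"),
]
survivors = negative_selection(positive_selection(population))
print([c.label for c in survivors])
```

Only cells that integrated and then excised the vector survive both steps, and these comprise both cells that acquired the modification and cells restored to wildtype, consistent with the enrichment (rather than absolute selection) described above.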

3. Individualized Optimization of Genes
In general, the above methods do not require knowledge of the number of
genes to be optimized, their map location or their function. However, in some instances,
where this information is available for one or more genes, it can be exploited. For example, if
the property to be acquired by evolution is enhanced recombination of cells, one gene likely to
be important is recA, even though many other genes, known and unknown, may make
additional contributions. In this situation, the recA gene can be evolved, at least in part,
separately from other candidate genes. The recA gene can be evolved by any of the methods
of recursive recombination described in Section V. Briefly, this approach entails obtaining
diverse forms of a recA gene, allowing the forms to recombine, selecting recombinants having
improved properties, and subjecting the recombinants to further cycles of recombination and
selection. At any point in the individualized improvement of recA, the diverse forms of recA
can be pooled with fragments encoding other genes in a library to be used in the general
methods described herein. In this way, the library is seeded to contain a higher proportion of
variants in a gene known to be important to the property sought to be acquired than would
otherwise be the case. 

In one example (illustrated in Fig. 203), a plasmid is constructed carrying a
non-functional (mutated) version of a chromosomal gene such as URA3, where the wild-type
gene confers sensitivity to a drug (in this case 5-fluoroorotic acid). The plasmid also carries a
selectable marker (resistance to another drug such as kanamycin), and a library of recA
variants. Transformation of the plasmid into the cell results in expression of the recA variants,
some of which will catalyze homologous recombination at an increased rate. Those cells in
which homologous recombination occurred are resistant to the selectable drug on the plasmid,
and to 5-fluoroorotic acid because of the disruption of the chromosomal copy of this gene.
The recA variants which give the highest rates of homologous recombination are the most
highly represented in a pool of homologous recombinants. The mutant recA genes can be
isolated from this pool by PCR, re-shuffled, cloned back into the plasmid and the process
repeated. Other sequences can be inserted in place of recA to evolve other components of the
homologous recombination system.

4. Harvesting DNA Substrates for Shuffling 
In some shuffling methods, DNA substrates are isolated from natural sources
and are not easily manipulated by DNA modifying or polymerizing enzymes due to recalcitrant
impurities, which poison enzymatic reactions. Such difficulties can be avoided by processing
DNA substrates through a harvesting strain. The harvesting strain is typically a cell type with
natural competence and a capacity for homologous recombination between sequences with
substantial diversity (e.g., sequences exhibiting only 75% sequence identity). The harvesting
strain bears a vector encoding a negative selection marker flanked by two segments
respectively complementary to two segments flanking a gene or other region of interest in the
DNA from a target organism. The harvesting strain is contacted with fragments of DNA from
the target organism. Fragments are taken up by natural competence, or other methods
described herein, and a fragment of interest from the target organism recombines with the
vector of the harvesting strain, causing loss of the negative selection marker. Selection against
the negative marker allows isolation of cells that have taken up the fragment of interest.
Shuffling can be carried out in the harvester strain (e.g., a RecE/T strain) or the vector can be
isolated from the harvester strain for in vitro shuffling or transfer to a different cell type for in
vivo shuffling. Alternatively, the vector can be transferred to a different cell type by
conjugation, protoplast fusion or electrofusion. An example of a suitable harvester strain is
Acinetobacter calcoaceticus mutS. Melnikov and Youngman, (1999) Nucl Acid Res
27(4): 1056-1062. This strain is naturally competent and takes up DNA in a
nonsequence-specific manner. Also, because of the mutS mutation, this strain is capable of
homologous recombination of sequences showing only 75% sequence identity.
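
As a rough illustration of the 75% identity figure, the following Python sketch computes
percent identity between two pre-aligned, equal-length sequences. The function names and the
toy flank sequences are hypothetical; a real comparison would first align the sequences, which
this sketch does not attempt.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold=75.0):
    """True if the two flanks share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy flanking segments (hypothetical): 9 of 12 positions match.
vector_flank = "ATGCATGCATGC"
target_flank = "ATGCATGCTTAA"

identity = percent_identity(vector_flank, target_flank)
print(identity, meets_identity_threshold(vector_flank, target_flank))
```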

IV. APPLICATIONS 

A. RECOMBINOGENICITY

One goal of whole cell evolution is to generate cells having improved capacity
for recombination. Such cells are useful for a variety of purposes in molecular genetics
including the in vivo formats of recursive sequence recombination described in Section V.
Almost thirty genes (e.g., recA, recB, recC, recD, recE, recF, recG, recO, recQ, recR, recT,
ruvA, ruvB, ruvC, sbcB, ssb, topA, gyrA and B, lig, polA, uvrD, E, recL, mutD, mutH, mutL,
mutT, mutU, helD) and DNA sites (e.g., chi, recN, sbcC) involved in genetic recombination
have been identified in E. coli, and cognate forms of several of these genes have been found in
other organisms (e.g., rad51, rad55-rad57, Dmc1 in yeast (see Kowalczykowski et al.,
Microbiol. Rev. 58, 401-465 (1994); Kowalczykowski & Zarling, supra) and human homologs
of Rad51 and Dmc1 have been identified (see Sandler et al., Nucl. Acids Res. 24, 2125-2132
(1996)). At least some of the E. coli genes, including recA, are functional in mammalian cells,
and can be targeted to the nucleus as a fusion with SV40 large T antigen nuclear targeting
sequence (Reiss et al., Proc. Natl. Acad. Sci. USA, 93, 3094-3098 (1996)). Further,
mutations in mismatch repair genes, such as mutL, mutS, mutH, mutT, relax homology
requirements and allow recombination between more diverged sequences (Rayssiguier et al.,
Nature 342, 396-401 (1989)). The extent of recombination between divergent strains can be
enhanced by impairing mismatch repair genes and stimulating SOS genes. Such can be
achieved by use of appropriate mutant strains and/or growth under conditions of metabolic
stress, which have been found to stimulate SOS and inhibit mismatch repair genes. Vulic et
al., Proc. Natl. Acad. Sci. USA 94 (1997). In addition, this can be achieved by impairing the
products of mismatch repair genes by exposure to selective inhibitors.

Starting substrates for recombination are selected according to the general
principles described above. That is, the substrates can be whole genomes or fractions thereof
containing recombination genes or sites. Large libraries of essentially random fragments can
be seeded with collections of fragments constituting variants of one or more known
recombination genes, such as recA. Alternatively, libraries can be formed by mixing variant
forms of the various known recombination genes and sites.

The library of fragments is introduced into the recipient cells to be improved
and recombination occurs, generating modified cells. The recipient cells preferably contain a
marker gene whose expression has been disabled in a manner that can be corrected by
recombination. For example, the cells can contain two copies of a marker gene bearing
mutations at different sites, which copies can recombine to generate the wildtype gene. A
suitable marker gene is green fluorescent protein. A vector can be constructed encoding one
copy of GFP having stop codons near the N-terminus, and another copy of GFP having
stop codons near the C-terminus of the protein. The distance between the stop codons at the
respective ends of the molecule is 500 bp and about 25% of recombination events result in
active GFP. Expression of GFP in a cell signals that a cell is capable of homologous
recombination to recombine in between the stop codons to generate a contiguous coding
sequence. By screening for cells expressing GFP, one enriches for cells having the highest
capacity for recombination. The same type of screen can be used following subsequent rounds
of recombination. However, unless the selection marker used in previous round(s) was
present on a suicide vector, subsequent round(s) should employ a second disabled screening
marker within a second vector bearing a different origin of replication or a different positive
selection marker to vectors used in the previous rounds.
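
The logic of the two-copy reporter can be sketched with a toy model. The Python below is an
illustrative assumption only: it uses an arbitrary 20-letter string in place of the GFP coding
sequence, marks a stop with "*", and considers a single crossover that takes the 5' portion
from the C-terminally disabled copy and the 3' portion from the N-terminally disabled copy.
It shows that only crossovers falling between the two stops restore the intact sequence. It
does not attempt to reproduce the 25% figure quoted above.

```python
def single_crossover(five_prime_donor, three_prime_donor, position):
    """Join the 5' part of one copy to the 3' part of the other at `position`."""
    return five_prime_donor[:position] + three_prime_donor[position:]


# Arbitrary placeholder for an intact coding sequence (not a real GFP sequence).
wild_type = "ABCDEFGHIJKLMNOPQRST"
n_stop_index, c_stop_index = 2, 17  # hypothetical stop positions

# Copy disabled near the N-terminus and copy disabled near the C-terminus.
copy_n_stop = wild_type[:n_stop_index] + "*" + wild_type[n_stop_index + 1:]
copy_c_stop = wild_type[:c_stop_index] + "*" + wild_type[c_stop_index + 1:]

# Crossover positions that regenerate the intact sequence.
functional_positions = [
    pos
    for pos in range(len(wild_type) + 1)
    if single_crossover(copy_c_stop, copy_n_stop, pos) == wild_type
]
print(functional_positions)
```

In this toy model, the productive crossovers are exactly those lying after the N-terminal
stop and at or before the C-terminal stop, which mirrors the statement that recombination
between the stop codons generates a contiguous coding sequence.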

B. MULTIGENOMIC COPY NUMBER - GENE REDUNDANCY

The majority of bacterial cells in stationary phase cultures grown in rich media
contain two, four or eight genomes. In minimal medium the cells contain one or two genomes.
The number of genomes per bacterial cell thus depends on the growth rate of the cell as it
enters stationary phase. This is because rapidly growing cells contain multiple replication
forks, resulting in several genomes in the cells after termination. The number of genomes is
strain dependent, although all strains tested have more than one chromosome in stationary
phase. The number of genomes in stationary phase cells decreases with time. This appears to
be due to fragmentation and degradation of entire chromosomes, similar to apoptosis in
mammalian cells. This fragmentation of genomes in cells containing multiple genome copies
results in massive recombination and mutagenesis. Useful mutants may find ways to use
energy sources that will allow them to continue growing. Multigenome or gene-redundant
cells are much more resistant to mutagenesis and can be improved for a selected trait faster.

Some cell types, such as Deinococcus radiodurans (Daly and Minton, J. Bacteriol.
177, 5495-5505 (1995)) exhibit polyploidy throughout the cell cycle. This cell type is highly
radiation resistant due to the presence of many copies of the genome. High frequency
recombination between the genomes allows rapid removal of mutations induced by a variety of
DNA damaging agents.

A goal of the present methods is to evolve other cell types to have increased
genome copy number akin to that of Deinococcus radiodurans. Preferably, the increased copy
number is maintained through all or most of its cell cycle in all or most growth conditions.
The presence of multiple genome copies in such cells results in a higher frequency of
homologous recombination in these cells, both between copies of a gene in different genomes
within the cell, and between a genome within the cell and a transfected fragment. The
increased frequency of recombination allows the cells to be evolved more quickly to acquire
other useful characteristics.

Starting substrates for recombination can be a diverse library of genes only a
few of which are relevant to genomic copy number, a focused library formed from variants of
gene(s) known or suspected to have a role in genomic copy number or a combination of the
two. As a general rule one would expect increased copy number would be achieved by
evolution of genes involved in replication and cell septation such that cell septation is inhibited
without impairing replication. Genes involved in replication include tus, xerC, xerD, dif,
gyrA, gyrB, parE, parC, dif, TerA, TerB, TerC, TerD, TerE, TerF, and genes influencing
chromosome partitioning and gene copy number include minD, mukA (tolC), mukB, mukC,
mukD, spoOJ, spoIIIE (Wake & Errington, Annu. Rev. Genet. 29, 41-67 (1995)). A useful
source of substrates is the genome of a cell type such as Deinococcus radiodurans known to have
the desired phenotype of multigenomic copy number. As well as, or instead of, the above
substrates, fragments encoding protein or antisense RNA inhibitors to genes known to be
involved in cell septation can also be used.

In nature, the existence of multiple genomic copies in a cell type would usually
not be advantageous due to the greater nutritional requirements needed to maintain this copy
number. However, artificial conditions can be devised to select for high copy number.
Modified cells having recombinant genomes are grown in rich media (in which conditions,
multicopy number should not be a disadvantage) and exposed to a mutagen, such as ultraviolet
or gamma irradiation or a chemical mutagen, e.g., mitomycin, nitrous acid, photoactivated
psoralens, alone or in combination, which induces DNA breaks amenable to repair by
recombination. These conditions select for cells having multicopy number due to the greater
efficiency with which mutations can be excised. Modified cells surviving exposure to mutagen
are enriched for cells with multiple genome copies. If desired, selected cells can be
individually analyzed for genome copy number (e.g., by quantitative hybridization with
appropriate controls). Some or all of the collection of cells surviving selection provide the
substrates for the next round of recombination. In addition, individual cells can be sorted
using a cell sorter for those cells containing more DNA, e.g., using DNA specific fluorescent
compounds or sorting for increased size using light dispersion. Eventually cells are evolved
that have at least 2, 4, 6, 8 or 10 copies of the genome throughout the cell cycle. In a similar
manner, protoplasts can also be recombined.

C. SECRETION 

The protein (or metabolite) secretion pathways of bacterial and eukaryotic cells
can be evolved to export desired molecules more efficiently, such as for the manufacturing of
protein pharmaceuticals, small molecule drugs or specialty chemicals. Improvements in
efficiency are particularly desirable for proteins requiring multisubunit assembly (such as
antibodies) or extensive posttranslational modification before secretion.

The efficiency of secretion may depend on a number of genetic sequences
including a signal peptide coding sequence, sequences encoding protein(s) that cleave or
otherwise recognize the coding sequence, and the coding sequence of the protein being
secreted. The latter may affect folding of the protein and the ease with which it can integrate
into and traverse membranes. The bacterial secretion pathway in E. coli includes the SecA,
SecB, SecE, SecD and SecF genes. In Bacillus subtilis, the major genes are secA, secD, secE,
secF, secY, ffh, ftsY together with five signal peptidase genes (sipS, sipT, sipU, sipV and
sipW) (Kunst et al., supra). For proteins requiring posttranslational modification, evolution of
genes effecting such modification may contribute to improved secretion. Likewise genes with
expression products having a role in assembly of multisubunit proteins (e.g., chaperonins) may
also contribute to improved secretion.

Selection of substrates for recombination follows the general principles
discussed above. In this case, the focused libraries referred to above comprise variants of the
known secretion genes. For evolution of prokaryotic cells to express eukaryotic proteins, the
initial substrates for recombination are often obtained at least in part from eukaryotic sources.
Incoming fragments can undergo recombination both with chromosomal DNA in recipient
cells and with the screening marker construct present in such cells (see below). The latter
form of recombination is important for evolution of the signal coding sequence incorporated in
the screening marker construct. Improved secretion can be screened by the inclusion of a
marker construct in the cells being evolved. The marker construct encodes a marker gene,
operably linked to expression sequences, and usually operably linked to a signal peptide coding
sequence. The marker gene is sometimes expressed as a fusion protein with a recombinant
protein of interest. This approach is useful when one wants to evolve the recombinant protein
coding sequence together with secretion genes.

In one variation, the marker gene encodes a product that is toxic to the cell
containing the construct unless the product is secreted. Suitable toxin proteins include
diphtheria toxin and ricin toxin. Propagation of modified cells bearing such a construct selects
for cells that have evolved to improve secretion of the toxin. Alternatively, the marker gene
can encode a ligand to a known receptor, and cells bearing the ligand can be detected by
FACS using labeled receptor. Optionally, such a ligand can be operably linked to a
phospholipid anchoring sequence that binds the ligand to the cell membrane surface following
secretion. (See commonly owned, copending 08/309,345). In a further variation, secreted
marker protein can be maintained in proximity with the cell secreting it by distributing
individual cells into agar drops. This is done, e.g., by droplet formation of a cell suspension.
Secreted protein is confined within the agar matrix and can be detected by, e.g., FACS. In
another variation, a protein of interest is expressed as a fusion protein together with
β-lactamase or alkaline phosphatase. These enzymes metabolize commercially available
chromogenic substrates (e.g., X-gal), but do so only after secretion into the periplasm.
Appearance of colored substrate in a colony of cells therefore indicates capacity to secrete the
fusion protein and the intensity of color is related to the efficiency of secretion.

The cells identified by these screening and selection methods have the capacity
to secrete increased amounts of protein. This capacity may be attributable to increased
secretion and increased expression, or to increased secretion alone.

1. Expression

Cells can also be evolved to acquire increased expression of a recombinant
protein. The level of expression is, of course, highly dependent on the construct from which
the recombinant protein is expressed and the regulatory sequences, such as the promoter,
enhancer(s) and transcription termination site contained therein. Expression can also be
affected by a large number of host genes having roles in transcription, posttranslational
modification and translation. In addition, host genes involved in synthesis of ribonucleotide
and amino acid monomers for transcription and translation may have indirect effects on
efficiency of expression. Selection of substrates for recombination follows the general
principles discussed above. In this case, focused libraries comprise variants of genes known to
have roles in expression. For evolution of prokaryotic cells to express eukaryotic proteins, the
initial substrates for recombination are often obtained, at least in part, from eukaryotic
sources; that is, eukaryotic genes encoding proteins such as chaperonins involved in secretion
and/or assembly of proteins. Incoming fragments can undergo recombination both with
chromosomal DNA in recipient cells and with the screening marker construct present in such
cells (see below).

Screening for improved expression can be effected by including a reporter
construct in the cells being evolved. The reporter construct expresses (and usually secretes) a
reporter protein, such as GFP, which is easily detected and nontoxic. The reporter protein can
be expressed alone or together with a protein of interest as a fusion protein. If the reporter
gene is secreted, the screening effectively selects for cells having either improved secretion or
improved expression, or both.

2. Plant Cells 

A further application of recursive sequence recombination is the evolution of
plant cells, and transgenic plants derived from the same, to acquire resistance to pathogenic
diseases (fungi, viruses and bacteria), insects, chemicals (such as salt, selenium, pollutants,
pesticides, herbicides, or the like), including, e.g., atrazine or glyphosate, or to modify
chemical composition, yield or the like. The substrates for recombination can again be whole
genomic libraries, fractions thereof or focused libraries containing variants of gene(s) known
or suspected to confer resistance to one of the above agents. Frequently, library fragments are
obtained from a different species to the plant being evolved.

The DNA fragments are introduced into plant tissues, cultured plant cells, plant
microspores, or plant protoplasts by standard methods including electroporation (Fromm et al.,
Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), infection by viral vectors such as cauliflower
mosaic virus (CaMV) (Hohn et al., Molecular Biology of Plant Tumors, (Academic Press,
New York, 1982) pp. 549-560; Howell, US 4,407,956), high velocity ballistic penetration by
small particles with the nucleic acid either within the matrix of small beads or particles, or on
the surface (Klein et al., Nature 327, 70-73 (1987)), use of pollen as vector (WO 85/01856),
or use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which
DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells upon infection
by Agrobacterium tumefaciens, and a portion is stably integrated into the plant genome
(Horsch et al., Science 233, 496-498 (1984); Fraley et al., Proc. Natl. Acad. Sci. USA 80,
4803 (1983)).

Diversity can also be generated by genetic exchange between plant protoplasts
according to the same principles described below for fungal protoplasts. Procedures for
formation and fusion of plant protoplasts are described by Takahashi et al., US 4,677,066;
Akagi et al., US 5,360,725; Shimamoto et al., US 5,250,433; Cheney et al., US 5,426,040.

After a suitable period of incubation to allow recombination to occur and for
expression of recombinant genes, the plant cells are contacted with the agent to which
resistance is to be acquired, and surviving plant cells are collected. Some or all of these plant
cells can be subject to a further round of recombination and screening. Eventually, plant cells
having the required degree of resistance are obtained.

These cells can then be cultured into transgenic plants. Plant regeneration from
cultured protoplasts is described in Evans et al., "Protoplast Isolation and Culture," Handbook
of Plant Cell Cultures 1, 124-176 (MacMillan Publishing Co., New York, 1983); Davey,
"Recent Developments in the Culture and Regeneration of Plant Protoplasts," Protoplasts
(1983) pp. 12-29, (Birkhauser, Basel 1983); Dale, "Protoplast Culture and Plant Regeneration
of Cereals and Other Recalcitrant Crops," Protoplasts (1983) pp. 31-41, (Birkhauser, Basel
1983); Binding, "Regeneration of Plants," Plant Protoplasts, pp. 21-73, (CRC Press, Boca
Raton, 1985).

In a variation of the above method, one or more preliminary rounds of
recombination and screening can be performed in bacterial cells according to the same general
strategy as described for plant cells. More rapid evolution can be achieved in bacterial cells
due to their greater growth rate and the greater efficiency with which DNA can be introduced
into such cells. After one or more rounds of recombination/screening, a DNA fragment library
is recovered from bacteria and transformed into the plant cells. The library can either be a
complete library or a focused library. A focused library can be produced by amplification from
primers specific for plant sequences, particularly plant sequences known or suspected to have
a role in conferring resistance.

i. Example: Concatemeric Assembly of Atrazine-Catabolizing Plasmid

Pseudomonas atrazine catabolizing genes AtzA and AtzB were subcloned from
pMD1 (de Souza et al., Appl. Environ. Microbiol. 61, 3373-3378 (1995); de Souza et al., J.
Bacteriol. 178, 4894-4900 (1996)) into pUC18. A 1.9 kb AvaI fragment containing AtzA
was end-filled and inserted into an AvaI site of pUC18. A 3.9 kb ClaI fragment containing
AtzB was end-filled and cloned into the HincII site of pUC18. AtzA was then excised from
pUC18 with EcoRI and BamHI, AtzB with BamHI and HindIII, and the two inserts were
co-ligated into pUC18 digested with EcoRI and HindIII. The result was a 5.8 kb insert
containing AtzA and AtzB in pUC18 (total plasmid size 8.4 kb).

Recursive sequence recombination was performed as follows. The entire 8.4
kb plasmid was treated with DNaseI in 50 mM Tris-Cl pH 7.5, 10 mM MnCl2 and fragments
between 500 and 2000 bp were gel purified. The fragments were assembled in a PCR reaction
using Tth-XL enzyme and buffer from Perkin Elmer, 2.5 mM MgOAc, 400 µM dNTPs and
serial dilutions of DNA fragments. The assembly reaction was performed in an MJ Research
"DNA Engine" programmed with the following cycles: 1) 94°C, 20 seconds; 2) 94°C, 15
seconds; 3) 40°C, 30 seconds; 4) 72°C, 30 seconds + 2 seconds per cycle; 5) go to step 2, 39
more times; 6) 4°C.
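
The cycling program above can be tallied to estimate its programmed hold time. The Python
below is a minimal sketch under stated assumptions: the 2 second extension increment is taken
to begin with the second cycle, and ramp times between temperatures and the final 4°C hold are
ignored. The function name and parameter names are illustrative.

```python
def assembly_program_seconds(cycles=40, initial_denature=20, denature=15,
                             anneal=30, extend=30, extend_increment=2):
    """Sum the programmed hold times for the assembly reaction.

    Steps 2-4 run `cycles` times (once, then 39 repeats). The extension
    step grows by `extend_increment` seconds each cycle; here the first
    cycle is assumed to use the base extension time.
    """
    total = initial_denature
    for cycle_index in range(cycles):
        total += denature + anneal + extend + extend_increment * cycle_index
    return total


total_seconds = assembly_program_seconds()
final_extension = 30 + 2 * 39  # extension time in the last cycle, in seconds
print(total_seconds, round(total_seconds / 60, 1), final_extension)
```

Under these assumptions the programmed holds sum to 4580 seconds, or a little over 76
minutes, with the extension step lengthening from 30 seconds to 108 seconds over the run.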

The AtzA and AtzB genes were not amplified from the assembly reaction using
the polymerase chain reaction. Instead, DNA was purified from the reaction by phenol
extraction and ethanol precipitation, and the assembled DNA was then digested with a
restriction enzyme that linearized the plasmid (KpnI: the KpnI site in pUC18 was lost during
subcloning, leaving only the KpnI site in AtzA). Linearized plasmid was gel-purified, self-ligated
overnight and transformed into E. coli strain NM522. (The choice of host strain was relevant:
very little plasmid of poor quality was obtained from a number of other commercially available
strains including TG1, DH10B, DH12S.)

Serial dilutions of the transformation reaction were plated onto LB plates
containing 50 µg/ml ampicillin; the remainder of the transformation was made 25% in glycerol
and frozen at -80°C. Once the transformed cells were titered, the frozen cells were plated at a
density of between 200 and 500 colonies on 150 mm diameter plates containing 500 µg/ml
atrazine and grown at 37°C.
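
The titering step can be expressed as simple arithmetic. The following Python sketch is
illustrative only: the colony count, dilution factor and volumes are hypothetical values
chosen to show how a titer translates into the volume of frozen stock needed to reach the
target density of 200 to 500 colonies per plate.

```python
def titer_from_plate(colonies, dilution_factor, plated_ul):
    """Colony-forming units per microliter of undiluted transformation."""
    if plated_ul <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_ul


def volume_to_plate_ul(titer_cfu_per_ul, target_colonies):
    """Microliters of undiluted stock expected to give `target_colonies`."""
    if titer_cfu_per_ul <= 0:
        raise ValueError("titer must be positive")
    return target_colonies / titer_cfu_per_ul


# Hypothetical titer plate: 70 colonies from 100 ul of a 10-fold dilution.
titer = titer_from_plate(colonies=70, dilution_factor=10, plated_ul=100)

# Volume of stock to plate for a mid-range target of 350 colonies per plate.
volume_ul = volume_to_plate_ul(titer, target_colonies=350)
print(titer, volume_ul)
```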

Atrazine at 500 µg/ml forms an insoluble precipitate. The products of the AtzA
and AtzB genes transform atrazine into a soluble product. Cells containing the wild type AtzA
and AtzB genes in pUC18 will thus be surrounded by a clear halo where the atrazine has been
degraded. The more active the AtzA and AtzB enzymes, the more rapidly a clear halo will
form and grow on atrazine-containing plates. Positives were picked as those colonies that
most rapidly formed the largest clear zones. The (approximately) 40 best colonies were
picked, pooled, grown in the presence of 50 µg/ml ampicillin and plasmid prepared from them.
The entire process (from DNase-treatment to plating on atrazine plates) was repeated 4 times
with 2000-4000 colonies/cycle.
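
The stringency of this screen follows directly from the numbers given. The short Python
sketch below only restates that arithmetic; the variable names are illustrative, and the
count of 40 picked colonies is treated as exact although the text gives it as approximate.

```python
# Figures taken from the protocol above.
rounds = 4
picked_per_round = 40            # approximate number of best colonies pooled
screened_low, screened_high = 2000, 4000  # colonies screened per cycle

# Fraction of screened colonies carried forward each round.
fraction_low = picked_per_round / screened_high
fraction_high = picked_per_round / screened_low

# Total colonies examined across all rounds.
total_low = rounds * screened_low
total_high = rounds * screened_high

print(f"selected per round: {fraction_low:.1%} to {fraction_high:.1%}")
print(f"total colonies screened: {total_low} to {total_high}")
```

Roughly the top 1 to 2 percent of colonies were therefore carried into each subsequent
round, out of 8000 to 16000 colonies examined in total.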

A modification was made in the fourth round. Cells were plated on both 500
µg/ml atrazine, and 500 µg/ml of the atrazine analogue terbutylazine, which was undegradable
by the wild type AtzA and AtzB genes. Positives were obtained that degraded both
compounds. The atrazine chlorohydrolase activity (product of the AtzA gene) was 10-100 fold
higher than that produced by the wild type gene.

D. PLANT GENOME SHUFFLING

Plant genome shuffling allows recursive cycles to be used for the introduction
and recombination of genes or pathways that confer improved properties to desired plant
species. Any plant species, including weeds and wild cultivars, showing a desired trait, such as
herbicide resistance, salt tolerance, pest resistance, or temperature tolerance, can be used as
the source of DNA that is introduced into the crop or horticultural host plant species.

Genomic DNA prepared from the source plant is fragmented (e.g., by DNaseI,
restriction enzymes, or mechanically) and cloned into a vector suitable for making plant
genomic libraries, such as pGA482 (An, G., 1995, Methods Mol. Biol. 44:47-58). This vector
contains the A. tumefaciens left and right borders needed for gene transfer to plant cells and
antibiotic markers for selection in E. coli, Agrobacterium, and plant cells. A multicloning site
is provided for insertion of the genomic fragments. A cos sequence is present for the efficient
packaging of DNA into bacteriophage lambda heads for transfection of the primary library into
E. coli. The vector accepts DNA fragments of 25-40 kb.

The primary library can also be directly electroporated into an A. tumefaciens
or A. rhizogenes strain that is used to infect and transform host plant cells (Main, GD et al.,
1995, Methods Mol. Biol. 44:405-412). Alternatively, DNA can be introduced by
electroporation or PEG-mediated uptake into protoplasts of the recipient plant species (Bilang
et al. (1994) Plant Mol. Biol. Manual, Kluwer Academic Publishers, A1:1-16) or by particle
bombardment of cells or tissues (Christou, ibid, A2:1-15). If necessary, antibiotic markers in
the T-DNA region can be eliminated, as long as selection for the trait is possible, so that the
final plant products contain no antibiotic genes.

Stably transformed whole cells acquiring the trait are selected on solid or liquid
media containing the agent to which the introduced DNA confers resistance or tolerance. If
the trait in question cannot be selected for directly, transformed cells can be selected with
antibiotics and allowed to form callus or regenerated to whole plants and then screened for the
desired property.

The second and further cycles consist of isolating genomic DNA from each
transgenic line and introducing it into one or more of the other transgenic lines. In each
round, transformed cells are selected or screened for incremental improvement. To speed the
process of using multiple cycles of transformation, plant regeneration can be deferred until the
last round. Callus tissue generated from the protoplasts or transformed tissues can serve as a
source of genomic DNA and new host cells. After the final round, fertile plants are
regenerated and the progeny are selected for homozygosity of the inserted DNAs. Ultimately,
a new plant is created that carries multiple inserts which additively or synergistically combine
to confer high levels of the desired trait. Alternatively, microspores can be isolated as
homozygotes generated from spontaneous diploids.

In addition, the introduced DNA that confers the desired trait can be traced
because it is flanked by known sequences in the vector. Either PCR or plasmid rescue is used
to isolate the sequences and characterize them in more detail. Long PCR (Foord, OS and
Rose, EA, 1995, PCR Primer: A Laboratory Manual, CSHL Press, pp 63-77) of the full 25-40
kb insert is achieved with the proper reagents and techniques using as primers the T-DNA
border sequences. If the vector is modified to contain the E. coli origin of replication and an
antibiotic marker between the T-DNA borders, a rare cutting restriction enzyme, such as NotI
or SfiI, that cuts only at the ends of the inserted DNA is used to create fragments containing
the source plant DNA that are then self-ligated and transformed into E. coli where they
replicate as plasmids. The total DNA or subfragment of it that is responsible for the
transferred trait can be subjected to in vitro evolution by DNA shuffling. The shuffled library
can be reiteratively recombined by any method herein and then introduced into host plant cells
and screened for improvement of the trait. In this way, single and multigene traits can be
transferred from one species to another and optimized for higher expression or activity leading
to whole organism improvement. This entire process can also be reiteratively repeated.

Alternatively, the cells can be transformed microspores with the regenerated
haploid plants being screened directly for improved traits as noted below.

E. MICROSPORE MANIPULATION

Microspores are haploid (1n) male spores that develop into pollen grains.
Anthers contain large numbers of microspores in early-uninucleate to first-mitosis stages.
Microspores have been successfully induced to develop into plants for most species, such as,
e.g., rice (Chen, CC 1977 In Vitro. 13: 484-489), tobacco (Atanassov, I. et al. 1998 Plant Mol
Biol. 38:1169-1178), Tradescantia (Savage JRK and Papworth DG. 1998 Mutat Res.
422:313-322), Arabidopsis (Park SK et al. 1998 Development 125:3789-3799), sugar beet
(Majewska-Sawka A and Rodrigues-Garcia MI 1996 J Cell Sci. 109:859-866), barley (Olsen
FL 1991 Hereditas 115:255-266) and oilseed rape (Boutillier KA et al. 1994 Plant Mol Biol.
26:1711-1723).

The plants derived from microspores are predominantly haploid or diploid
(infrequently polyploid and aneuploid). The diploid plants are homozygous and fertile and can
be generated in a relatively short time. Microspores obtained from F1 hybrid plants represent
great diversity, thus being an excellent model for studying recombination. In addition,
microspores can be transformed with T-DNA introduced by Agrobacterium or other available
means and then regenerated into individual plants. Furthermore, protoplasts can be made from
microspores and they can be fused in a manner similar to that which occurs in fungi and bacteria.

Microspores, due to their complex ploidy and regenerating ability, provide a
tool for plant whole genome shuffling. For example, if pollen from 4 parents is collected
and pooled, and then used to randomly pollinate the parents, the progenies should have 2^4 =
16 possible combinations. Assuming this plant has 7 chromosomes, microspores collected
from the 16 progenies will represent 2^7 × 16 = 2048 possible chromosomal combinations. This
number is even greater if meiotic processes occur. When diploid, homozygous embryos are
generated from these microspores, in many cases, they are screened for desired phenotypes,
such as herbicide or disease resistance. In addition, for plant oil composition these embryos
can be dissected into two halves: one for analysis, the other for regeneration into a viable plant.
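
The combinatorial estimate above can be checked with a few lines of Python. This sketch
assumes, as the text implies, that each microspore receives one of two homologs for each
chromosome independently and that no crossing over occurs; the function name is illustrative.

```python
def microspore_combinations(progeny_classes, haploid_chromosomes):
    """Chromosomal combinations among microspores, ignoring crossing over.

    Each progeny plant can produce 2 ** haploid_chromosomes distinct
    assortments of homologs; this is multiplied by the number of
    distinct progeny classes.
    """
    return progeny_classes * 2 ** haploid_chromosomes


progeny = 2 ** 4  # 16 combinations, as stated in the text for 4 pooled parents
combinations = microspore_combinations(progeny, haploid_chromosomes=7)
print(progeny, combinations)
```

Because crossing over is ignored, this figure is a lower bound, consistent with the remark
that the number is greater when meiotic processes occur.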

Protoplasts generated from microspores (especially the haploid ones) are
pooled and fused. Microspores obtained from plants generated by protoplast fusion are
pooled and fused again, increasing the genetic diversity of the resulting microspores.

Microspores can be subjected to mutagenesis in various ways, such as by
chemical mutagenesis, radiation-induced mutagenesis and, e.g., T-DNA transformation, prior
to fusion or regeneration. New mutations which are generated can be recombined through the
recursive processes described above and herein.

F. EXAMPLE: ACQUISITION OF SALT TOLERANCE

As depicted in Fig. 21, DNA from a salt tolerant plant is isolated and used to
create a genomic library. Protoplasts made from the recipient species are
transformed/transfected with the genomic library (e.g., by electroporation, Agrobacterium,
etc.). Cells are selected on media with a normally inhibitory level of NaCl. Only the cells with
newly acquired salt tolerance will grow into callus tissue. The best lines are chosen and
genomic libraries are made from their pooled DNA. These libraries are transformed into
protoplasts made from the first round transformed calli. Again, cells are selected on increased
salt concentrations. After the desired level of salt tolerance is achieved, the callus tissue can
be induced to regenerate whole plants. Progeny of these plants are typically analyzed for
homozygosity of the inserts to ensure stability of the acquired trait. At the indicated steps,
plant regeneration or isolation and shuffling of the introduced genes can be added to the
overall protocol.

G. TRANSGENIC ANIMALS 

1. Transgene Optimization 
One goal of transgenesis is to produce transgenic animals, such as mice,
rabbits, sheep, pigs, goats, and cattle, secreting a recombinant protein in the milk. A transgene
for this purpose typically comprises in operable linkage a promoter and an enhancer from a
milk-protein gene (e.g., α, β, or γ casein, β-lactoglobulin, acid whey protein or α-lactalbumin),
a signal sequence, a recombinant protein coding sequence and a transcription termination site.
Optionally, a transgene can encode multiple chains of a multichain protein, such as an
immunoglobulin, in which case, the two chains are usually individually operably linked to sets
of regulatory sequences. Transgenes can be optimized for expression and secretion by
recursive sequence recombination. Suitable substrates for recombination include regulatory
sequences such as promoters and enhancers from milk-protein genes from different species or
individual animals. Cycles of recombination can be performed in vitro or in vivo by any of the
formats discussed in Section V. Screening is performed in vivo on cultures of mammary-gland
derived cells, such as HC11 or MacT, transfected with transgenes and reporter constructs such
as those discussed above. After several cycles of recombination and screening, transgenes
resulting in the highest levels of expression and secretion are extracted from the mammary
gland tissue culture cells and used to transfect embryonic cells, such as zygotes and embryonic
stem cells, which are matured into transgenic animals.



WO 00/04190 PCT/US99/15972

2. Whole Animal Optimization 
In this approach, libraries of incoming fragments are transformed into
embryonic cells, such as ES cells or zygotes. The fragments can be variants of a gene known
to confer a desired property, such as growth hormone. Alternatively, the fragments can be
partial or complete genomic libraries including many genes.

Fragments are usually introduced into zygotes by microinjection as described
by Gordon et al., Methods Enzymol. 101, 414 (1984); Hogan et al., Manipulation of the
Mouse Embryo: A Laboratory Manual (C.S.H.L. N.Y., 1986) (mouse embryo); and Hammer
et al., Nature 315, 680 (1985) (rabbit and porcine embryos); Gandolfi et al., J. Reprod. Fert.
81, 23-28 (1987); Rexroad et al., J. Anim. Sci. 66, 947-953 (1988) (ovine embryos) and
Eyestone et al., J. Reprod. Fert. 85, 715-720 (1989); Camous et al., J. Reprod. Fert. 72, 779-785
(1984); and Heyman et al., Theriogenology 27, 59-68 (1987) (bovine embryos). Zygotes
are then matured and introduced into recipient female animals which gestate the embryo and
give birth to a transgenic offspring.

Alternatively, transgenes can be introduced into embryonic stem cells (ES).
These cells are obtained from preimplantation embryos cultured in vitro. Bradley et al.,
Nature 309, 255-258 (1984). Transgenes can be introduced into such cells by electroporation
or microinjection. Transformed ES cells are combined with blastocysts from a non-human
animal. The ES cells colonize the embryo and in some embryos form the germ line of the
resulting chimeric animal. See Jaenisch, Science, 240, 1468-1474 (1988).

Regardless of whether zygotes or ES cells are used, screening is performed on whole
animals for a desired property, such as increased size and/or growth rate. DNA is extracted
from animals having evolved toward acquisition of the desired property. This DNA is then
used to transfect further embryonic cells. These cells can also be obtained from animals that
have evolved toward the desired property in a split and pool approach. That is, DNA from
one subset of such animals is transformed into embryonic cells prepared from another subset
of the animals. Alternatively, the DNA from animals that have evolved toward acquisition of
the desired property can be transfected into fresh embryonic cells. In either alternative,
transfected cells are matured into transgenic animals, and the animals subjected to a further
round of screening for the desired property.

Fig. 4 shows the application of this approach for evolving fish toward a larger
size. Initially, a library is prepared of variants of a growth hormone gene. The variants can be
natural or induced. The library is coated with recA protein and transfected into fertilized fish
eggs. The fish eggs then mature into fish of different sizes. The growth hormone gene
fragment of genomic DNA from large fish is then amplified by PCR and used in the next round
of recombination. Alternatively, fish α-IFN is evolved to enhance resistance to viral infections
as described below.

3. Evolution of improved hormones for expression in transgenic
animals (e.g., fish) to create animals with improved traits.
Hormones and cytokines are key regulators of size, body weight, viral
resistance and many other commercially important traits. DNA shuffling is used to rapidly
evolve the genes for these proteins using in vitro assays. This was demonstrated with the
evolution of the human alpha interferon genes to have potent antiviral activity on murine cells.
Large improvements in activity were achieved in two cycles of family shuffling of the human
IFN genes.

In general, a method of increasing resistance to virus infection in cells can be
performed by first introducing a shuffled library comprising at least one shuffled interferon
gene into animal cells to create an initial library of animal cells or animals. The initial library is
then challenged with the virus. Animal cells or animals which are resistant to the virus are
selected from the initial library, and a plurality of transgenes is recovered from a plurality of
the resistant animal cells or animals. The plurality of transgenes is recovered
to produce an evolved library of animal cells or animals which is again challenged with the
virus. Cells or animals which are resistant to the virus are selected from the evolved library.

For example, genes evolved with in vitro assays are introduced into the
germplasm of animals or plants to create improved strains. One limitation of this procedure is
that in vitro assays are often only crude predictors of in vivo activity. However, with
improving methods for the production of transgenic plants and animals, one can now marry
whole organism breeding with molecular breeding. The approach is to introduce shuffled
libraries of hormone genes into the species of interest. This can be done with a single gene per
transgenic or with pools of genes per transgenic. Progeny are then screened for the phenotype
of interest. In this case, shuffled libraries of interferon genes (alpha IFN for example) are
introduced into transgenic fish. The library of transgenic fish is challenged with a virus. The
most resistant fish are identified (i.e., either survivors of a lethal challenge, or those that are
deemed most "healthy" after the challenge). The IFN transgenes are recovered by PCR and
shuffled in either a poolwise or a pairwise fashion. This generates an evolved library of IFN
genes. A second library of transgenic fish is created and the process is repeated. In this way,
IFN is evolved for improved antiviral activity in a whole organism assay.

This procedure is general and can be applied to any trait that is affected by a
gene or gene family of interest and which can be quantitatively measured.

Fish interferon sequence data is available for the Japanese flatfish
(Paralichthys olivaceus) as mRNA sequence (Tamai et al. (1993) "Cloning and expression of
flatfish (Paralichthys olivaceus) interferon cDNA." Biochim. Biophys. Acta 1174, 182-186;
see also, Tamai et al. (1993) "Purification and characterization of interferon-like antiviral
protein derived from flatfish (Paralichthys olivaceus) lymphocytes immortalized by
oncogenes." Cytotechnology 1993; 11(2): 121-131). This sequence can be used to clone out
IFN genes from this species. This sequence can also be used as a probe to clone homologous
interferons from additional species of fish. As well, additional sequence information can be
utilized to clone out more species of fish interferons. Once a library of interferons has been
cloned, these can be family shuffled to generate a library of variants.

A protein sequence of flatfish interferon is:
MIRSTNSNKS DILMNCHHLDR YDDNSAPSGGSL FRKMIMLLKL LKLITFGQLRW
ELFVKSNTSKTS TVLSmGSNLISL LDAPKDILDKPSCNSF QLDLLLASSAWTLLT
ARLLNYPYPA VLLSAGVASWLVQVP.

In one embodiment, BHK-21 (a fibroblast cell line from hamster) can be
transfected with the shuffled IFN-expression plasmids. Active recombinant IFN is produced
and then purified by WGA agarose affinity chromatography (Tamai et al. 1993 Biochim.
Biophys. Acta, supra). The antiviral activity of IFN can be measured on fish cells challenged
by rhabdovirus (Tamai et al. (1993) "Purification and characterization of interferon-like
antiviral protein derived from flatfish (Paralichthys olivaceus) lymphocytes immortalized by
oncogenes." Cytotechnology 1993; 11(2): 121-131).

H. WHOLE GENOME SHUFFLING IN HIGHER ORGANISMS--
POOLWISE RECURSIVE BREEDING

The present invention provides a procedure for generating large combinatorial
libraries of higher eukaryotes, plants, fish, domesticated animals, etc. In addition to the
procedures outlined above, poolwise combination of male and female gametes can also be
used to generate large diverse molecular libraries.

In one aspect, the process includes recursive poolwise matings for several
generations without any deliberate screening. This is similar to classical breeding, except that
pools of organisms, rather than pairs of organisms, are mated, thereby accelerating the
generation of genetic diversity.
This method is similar to recursive fusion of a diverse population of bacterial
protoplasts resulting in the generation of multiparent progeny harboring genetic information
from all of the starting population of bacteria. The process described here is to perform
analogous artificial or natural matings of large populations of natural isolates, imparting a split
pool mating strategy. Before mating, all of the male gametes, i.e., pollen, sperm, etc., are
isolated from the starting population and pooled. These are then used to "self" fertilize a
mixed pool of the female gametes from the same population.

The process is repeated with the subsequent progeny for several generations,
with the final progeny being a combinatorial organism library with each member having
genetic information originating from many if not all of the starting "parents." This process
generates large diverse organism libraries on which many selections and/or screens can be
imparted, and it does not require sophisticated in vitro manipulation of genes. However, it
results in the creation of useful new strains (perhaps well diluted in the population) in a much
shorter time frame than such organisms could be generated using a classical targeted breeding
approach.
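
The statement that each member of the final library can carry information from many of the starting "parents" can be explored with a small pedigree simulation. The sketch below is illustrative only and is not part of the described procedure. It assumes a constant population size, that any two distinct individuals can be crossed (as with pooled pollen and eggs of a hermaphroditic plant), and a fixed random seed; the function name `poolwise_breeding` is hypothetical. It tracks pedigree ancestry only, not which DNA segments actually survive meiosis.

```python
import random

def poolwise_breeding(n_founders, n_generations, seed=0):
    """Return the mean number of founders in an individual's pedigree, per generation."""
    rng = random.Random(seed)
    # Generation 0: each founder's pedigree contains only itself.
    population = [frozenset([i]) for i in range(n_founders)]
    means = [1.0]
    for _ in range(n_generations):
        offspring = []
        for _ in range(n_founders):  # constant population size (assumption)
            # Pooled gametes: any two distinct individuals may be the parents.
            mother, father = rng.sample(population, 2)
            offspring.append(mother | father)
        population = offspring
        means.append(sum(len(p) for p in population) / len(population))
    return means

# Six cycles of recursive breeding starting from 50 founders.
for generation, mean in enumerate(poolwise_breeding(50, 6)):
    print(generation, round(mean, 1))
```

In this model the number of founders represented in any one pedigree can at most double per generation, so after g generations it is bounded by the smaller of 2^g and the number of founders.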

These libraries are generated relatively quickly (e.g., typically in less than three
years for most plants of commercial interest, with six cycles or less of recursive breeding being
sufficient to generate desired diversity).

An additional benefit of these methods is that the resulting libraries provide
organismal diversity in areas, such as agriculture, aquaculture, and animal husbandry, that are
currently genetically homogeneous.

Examples of these methods for several organisms are described below.

1. Plants 

A population of plants, for example all of the different corn strains in a
commercial seed/germplasm collection, is grown and the pollen from the entire population is
harvested and pooled. This mixed pollen population is then used to "self" fertilize the same
population. Self pollination is prevented, so that the fertilization is combinatorial. The cross
results in all pairwise crosses possible within the population, and the resulting seeds result in
many of the possible outcomes of each of these pairwise crosses. The seeds from the fertilized
plants are then harvested, pooled, planted, and the pollen is again harvested, pooled, and used
to "self" fertilize the population. After only several generations, the resulting population is a
very diverse combinatorial library of corn. The seeds from this library are harvested and
screened for desirable traits, e.g., salt tolerance, growth rate, productivity, yield, disease
resistance, etc. Essentially any plant collection can be modified by this approach. Important
commercial crops include both monocots and dicots. Monocots include plants in the grass
family (Gramineae), such as plants in the sub families Festucoideae and Poacoideae, which
together include several hundred genera including plants in the genera Agrostis, Phleum,
Dactylis, Sorghum, Setaria, Zea (e.g., corn), Oryza (e.g., rice), Triticum (e.g., wheat), Secale
(e.g., rye), Avena (e.g., oats), Hordeum (e.g., barley), Saccharum, Poa, Festuca,
Stenotaphrum, Cynodon, Coix, the Olyreae, Phareae and many others. Plants in the family
Gramineae are particularly preferred target plants for the methods of the invention.
Additional preferred targets include other commercially important crops, e.g., from the
families Compositae (the largest family of vascular plants, including at least 1,000 genera,
including important commercial crops such as sunflower), and Leguminosae or "pea family,"
which includes several hundred genera, including many commercially valuable crops such as
pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover, alfalfa, lupine,
vetch, lotus, sweet clover, wisteria, and sweetpea. Common crops applicable to the methods
of the invention include Zea mays, rice, soybean, sorghum, wheat, oats, barley, millet,
sunflower, and canola.

This process can also be carried out using pollen from different species or more
divergent strains (e.g., crossing the ancient grasses with corn). Different plant species can be
forced to cross. Only a few plants from an initial cross would have to result in order to make
the process viable. These few progeny, e.g., from a cross between soybean and corn, would
generate pollen and eggs, each of which would represent a different meiotic outcome from the
recombination of the two genomes. The pollen would be harvested and used to "self"
pollinate the original progeny. This process would then be carried out recursively. This would
generate a large family shuffled library of two or more species, which could be subsequently
screened.

The above strategy is illustrated schematically in Figure 30. 

2. Fish 

The natural tendency of fish to lay their eggs outside of the body and to have a
male cover those eggs with sperm provides another opportunity for a split pooled breeding
strategy. The eggs from many different fish, e.g., salmon from different fisheries about the
world, can be harvested, pooled, and then fertilized with similarly collected and pooled
salmon sperm. The fertilization will result in all of the possible pairwise matings of the starting
population. The resulting progeny is then grown and again the sperm and eggs are harvested
and pooled, with each egg and sperm representing a different meiotic outcome of the different
crosses. The pooled sperm are then used to fertilize the pooled eggs and the process is carried
out recursively. After several generations the resulting progeny can then be subjected to
selections and screens for desired properties, such as size, disease resistance, etc.
The above strategy is illustrated schematically in Figure 29.

3. Animals 

The advent of in vitro fertilization and surrogate motherhood provides a means
of whole genome shuffling in animals such as mammals. As with fish, the eggs and the sperm
from a population, for example from all slaughter cows, are collected and pooled. The pooled
eggs are then in vitro fertilized with the pooled sperm. The resulting embryos are then
returned to surrogate mothers for development. As above, this process is repeated recursively
until a large diverse population is generated that can be screened for desirable traits.

A technically feasible approach would be similar to that used for plants. In this
case, sperm from the males of the starting population is collected and pooled, and then this
pooled sample is used to artificially inseminate multiple females from each of the starting
populations. Only one (or a few) sperm would succeed in each animal, but these should be
different for each fertilization. The process is reiterated by harvesting the sperm from all of the
male progeny, pooling it, and using it to fertilize all of the female progeny. The process is
carried out recursively for several generations to generate the organism library, which can then
be screened.

I. RAPID EVOLUTION AS A PREDICTIVE TOOL 

Recursive sequence recombination can be used to simulate natural evolution of
pathogenic microorganisms in response to exposure to a drug under test. Using recursive
sequence recombination, evolution proceeds at a faster rate than in natural evolution. One
measure of the rate of evolution is the number of cycles of recombination and screening
required until the microorganism acquires a defined level of resistance to the drug. The
information from this analysis is of value in comparing the relative merits of different drugs
and in particular, in predicting their long term efficacy on repeated administration.
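
One way to make this measure concrete is to record the resistance level reached after each cycle and count the cycles needed to reach a defined threshold. The sketch below assumes such per-cycle measurements are available; the function name, the threshold, and the numbers for the two drugs are hypothetical values chosen only for illustration.

```python
def cycles_to_resistance(levels_by_cycle, threshold):
    """Return the 1-based cycle at which resistance first reaches `threshold`, or None."""
    for cycle, level in enumerate(levels_by_cycle, start=1):
        if level >= threshold:
            return cycle
    return None  # the defined level was never reached in the recorded cycles

# Hypothetical resistance levels after each cycle of recombination and screening
# (e.g., fold increase in tolerated drug concentration).
drug_a = [1, 2, 8, 40]
drug_b = [1, 1, 2, 3, 5, 12]
threshold = 10

print(cycles_to_resistance(drug_a, threshold))  # 4
print(cycles_to_resistance(drug_b, threshold))  # 6
```

Under this measure, a drug that requires more cycles before the threshold is reached would be predicted to retain its efficacy longer on repeated administration.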

The pathogenic microorganisms used in this analysis include the bacteria that
are a common source of human infections, such as chlamydia, rickettsial bacteria,
mycobacteria, staphylococci, streptococci, pneumococci, meningococci and gonococci,
klebsiella, proteus, serratia, pseudomonas, legionella, diphtheria, salmonella, bacilli,
cholera, tetanus, botulism, anthrax, plague, leptospirosis, and Lyme disease bacteria.
Evolution is effected by transforming an isolate of bacteria that is sensitive to a drug under test
with a library of DNA fragments. The fragments can be a mutated version of the genome of
the bacteria being evolved. If the target of the drug is a known protein or nucleic acid, a
focused library containing variants of the corresponding gene can be used. Alternatively, the
library can come from other kinds of bacteria, especially bacteria typically found inhabiting
human tissues, thereby simulating the source material available for recombination in vivo. The
library can also come from bacteria known to be resistant to the drug. After transformation
and propagation of bacteria for an appropriate period to allow for recombination to occur and
recombinant genes to be expressed, the bacteria are screened by exposing them to the drug
under test and then collecting survivors. Surviving bacteria are subject to further rounds of
recombination. The subsequent round can be effected by a split and pool approach in which
DNA from one subset of surviving bacteria is introduced into a second subset of bacteria.
Alternatively, a fresh library of DNA fragments can be introduced into surviving bacteria.
Subsequent round(s) of selection can be performed at increasing concentrations of drug,
thereby increasing the stringency of selection.

A similar strategy can be used to simulate viral acquisition of drug resistance.
The object is to identify drugs for which resistance can be acquired only slowly, if at all. The
viruses to be evolved are those that cause infections in humans for which at least modestly
effective drugs are available. Substrates for recombination can come from induced mutants,
natural variants of the same viral strain, or different viruses. If the target of the drug is known
(e.g., nucleotide analogs which inhibit the reverse transcriptase gene of HIV), focused libraries
containing variants of the target gene can be produced. Recombination of a viral genome with
a library of fragments is usually performed in vitro. However, in situations in which the library
of fragments constitutes variants of viral genomes or fragments that can be encompassed in
such genomes, recombination can also be performed in vivo, e.g., by transfecting cells with
multiple substrate copies (see Section V). For screening, recombinant viral genomes are
introduced into host cells susceptible to infection by the virus and the cells are exposed to a
drug effective against the virus (initially at low concentration). The cells can be spun to
remove any noninfected virus. After a period of infection, progeny viruses can be collected
from the culture medium, the progeny viruses being enriched for viruses that have acquired at
least partial resistance to the drug. Alternatively, virally infected cells can be plated in a soft
agar lawn and resistant viruses isolated from plaques. Plaque size provides some indication of
the degree of viral resistance.

Progeny viruses surviving screening are subject to additional rounds of
recombination and screening at increased stringency until a predetermined level of drug
resistance has been acquired. The predetermined level of drug resistance may reflect the
maximum dosage of a drug practical to administer to a patient without intolerable side effects.
The analysis is particularly valuable for investigating acquisition of resistance to various
combinations of drugs, such as the growing list of approved anti-HIV drugs (e.g., AZT, ddI,
ddC, d4T, TIBO 82150, nevirapine, 3TC, crixivan and ritonavir).

J. THE EVOLUTIONARY IMPORTANCE OF RECOMBINATION 
Strain improvement is the directed evolution of an organism to be more "fit"
for a desired task. In nature, adaptation is facilitated by sexual recombination. Sexual
recombination allows a population to exploit the genetic diversity within it, e.g., by
consolidating useful mutations and discarding deleterious ones. In this way, adaptation and
evolution can proceed in leaps. In the absence of a sexual cycle, members of a population
must evolve independently by accumulating random mutations sequentially. Many useful
mutations are lost while deleterious mutations can accumulate. Adaptation and evolution in
this way proceeds slowly as compared to sexual evolution.

As shown in Fig. 17, asexual evolution is a slow and inefficient process.
Populations move as individuals rather than as groups. A diverse population is generated by
the mutagenesis of a single parent resulting in a distribution of fit and unfit individuals. In the
absence of a sexual cycle, each piece of genetic information of the surviving population
remains in the individual mutants. Selection of the "fittest" results in many "fit" individuals
being discarded along with the useful genetic information they carry. Asexual evolution
proceeds one genetic event at a time and is thus limited by the intrinsic value of a single
genetic event. Sexual evolution moves more quickly and efficiently. Mating within a
population consolidates genetic information within the population and results in useful
mutations being combined together. The combining of useful genetic information results in
progeny that are much more fit than their parents. Sexual evolution thus proceeds much faster
by multiple genetic events.

Years of plant and animal breeding have demonstrated the power of employing
sexual recombination to effect the rapid evolution of complex genomes towards a particular
task. This general principle is further demonstrated by using DNA shuffling to recombine
DNA molecules in vitro to accelerate the rate of directed molecular evolution. The strain
improvement efforts of the fermentation industry rely on the directed evolution of
microorganisms by sequential random mutagenesis. Incorporation of recombination into this
iterative process greatly accelerates the strain improvement process, which in turn increases
the profitability of current fermentation processes and facilitates the development of new
products.

K. DNA SHUFFLING VS NATURAL RECOMBINATION--THE UTILITY
OF POOLWISE RECOMBINATION.

DNA shuffling includes the recursive recombination of DNA sequences. A
significant difference between DNA shuffling and natural sexual recombination is that DNA
shuffling can produce DNA sequences originating from multiple parental sequences while
sexual recombination produces DNA sequences originating from only two parental sequences
(Fig. 25).

As shown in Fig. 25, the rate of evolution is in part limited by the number of
useful mutations that a member of a population can accumulate between selection events. In
sequential random mutagenesis, useful mutations are accumulated one per selection event.
Many useful mutations are discarded each cycle in favor of the best performer, and neutral or
deleterious mutations which survive are as difficult to lose as they were to gain and thus
accumulate. In sexual evolution, pairwise recombination allows mutations from two different
parents to segregate and recombine in different combinations. Useful mutations can
accumulate and deleterious mutations can be lost. Poolwise recombination, such as that
effected by DNA shuffling, has the same advantages as pairwise recombination but allows
mutations from many parents to consolidate into a single progeny. Thus poolwise
recombination provides a means for increasing the number of useful mutations that can
accumulate in each selection event. The graph in Fig. 25 shows a plot of the potential number of
mutations an individual can accumulate by each of these processes. Recombination is
exponentially superior to sequential random mutagenesis, and this advantage increases
exponentially with the number of parents that can recombine. Sexual recombination is thus
more conservative. In nature, the pairwise nature of sexual recombination may provide
important stability within a population by impeding the large changes in DNA sequence that
can result from poolwise recombination. For the purposes of directed evolution, however,
poolwise recombination is more efficient.

The potential diversity that can be generated from a population is greater as a
result of poolwise recombination as compared to that resulting from pairwise recombination.
Further, poolwise recombination enables the combining of multiple beneficial mutations
originating from multiple parental sequences.

To demonstrate the importance of poolwise recombination vs pairwise
recombination in the generation of molecular diversity, consider the breeding of ten
independent DNA sequences each containing only one unique mutation. There are 2^10 = 1024
different combinations of those ten mutations, ranging from a single sequence having no
mutations (the consensus) to that having all ten mutations. If this pool were recombined
together by pairwise recombination, a population containing the consensus, the ten parents, and
the 45 different combinations of any two of the mutations would result, i.e., 56 or ca. 5% of the
possible 1024 mutant combinations. Alternatively, if the pool were recombined together in a
poolwise fashion, all 1024 would be theoretically generated, resulting in an approximately 20
fold increase in library diversity. When looking for a unique solution to a problem in
molecular evolution, the more complex the library, the more complex the possible solution.
Indeed, the most fit member of a shuffled library often contains several mutations originating
from several independent starting sequences.
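
The arithmetic in the ten-sequence example can be checked with a short calculation. The sketch below is illustrative only. The function name `reachable_combinations` and its parameter `max_parents` are assumptions introduced here, and it assumes, as the example does, that each starting sequence carries exactly one unique mutation, so a recombinant drawing on at most k parents carries at most k of the mutations.

```python
from math import comb

def reachable_combinations(n_mutations, max_parents):
    """Count mutation combinations that carry at most `max_parents` of the mutations."""
    return sum(comb(n_mutations, k) for k in range(max_parents + 1))

n = 10
total = 2 ** n                               # every subset of the ten mutations
pairwise = reachable_combinations(n, 2)      # consensus + 10 parents + 45 doubles
poolwise = reachable_combinations(n, n)      # all ten parents may contribute

print(total, pairwise, poolwise)             # 1024 56 1024
print(round(100 * pairwise / total, 1))      # 5.5
print(round(poolwise / pairwise, 1))         # 18.3
```

The computed ratio of about 18 corresponds to the "approximately 20 fold" increase in library diversity stated in the text.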

1. DNA Shuffling Provides Recursive Pairwise Recombination
In vitro DNA shuffling results in the efficient production of combinatorial
genetic libraries by catalyzing the recombination of multiple DNA sequences. While the result
of DNA shuffling is a population representing the poolwise recombination of multiple
sequences, the process does not rely on the recombination of multiple DNA sequences
simultaneously, but rather on their recursive pairwise recombination. The assembly of
complete genes from a mixed pool of small gene fragments requires multiple annealing and
elongation cycles, the thermal cycles of the primerless PCR reaction. During each thermal
cycle many pairs of fragments anneal and are extended to form a combinatorial population of
larger chimeric DNA fragments. After the first cycle of reassembly, chimeric fragments
contain sequence originating from predominantly two different parent genes, with all possible
pairs of "parental" sequence theoretically represented. This is similar to the result of a single
sexual cycle within a population. During the second cycle, these chimeric fragments anneal
with each other or with other small fragments, resulting in chimeras originating from up to
four of the different starting sequences, again with all possible combinations of the four
parental sequences theoretically represented. This second cycle is analogous to the entire
population resulting from a single sexual cross, both parents and offspring, inbreeding.

Further cycles result in chimeras originating from 8, 16, 32, etc. parental
sequences and are analogous to further inbreedings of the preceding population. This could be
considered similar to the diversity generated from a small population of birds that are isolated
on an island, breeding with each other for many generations. The result mimics the outcome
of "poolwise" recombination, but the path is via recursive pairwise recombination. For this
reason, the DNA molecules generated from in vitro DNA shuffling are not the "progeny" of
the starting "parental" sequences, but rather the great, great, great, great^n ... (n = number of
thermal cycles) grand progeny of the starting "ancestor" molecules.
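
The doubling described above (2, 4, 8, 16, 32 parental sequences) can be tabulated as follows. This is a minimal sketch and not part of the described method: the function name `max_parents_per_chimera` and the cap at the size of the starting pool are assumptions added for illustration, and the values are theoretical upper bounds, not measured outcomes.

```python
def max_parents_per_chimera(cycle, n_starting_sequences):
    """Upper bound on distinct parental sequences in one chimera after `cycle` thermal cycles."""
    # Each annealing/elongation cycle joins two fragments, so the bound doubles per cycle,
    # but it cannot exceed the number of sequences in the starting pool (assumption).
    return min(2 ** cycle, n_starting_sequences)

for cycle in range(1, 6):
    print(cycle, max_parents_per_chimera(cycle, 100))
```

With a pool of only ten starting sequences, the bound would saturate at ten after the fourth cycle, which is the point at which the outcome of recursive pairwise recombination matches that of poolwise recombination.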

L. FERMENTATION 

The fermentation of microorganisms for the production of natural products is
the oldest and most sophisticated application of biocatalysis. Industrial microorganisms effect
the multistep conversion of renewable feedstocks to high value chemical products in a single
reactor and in so doing catalyze a multi-billion dollar industry. Fermentation products range
from fine and commodity chemicals such as ethanol, lactic acid, amino acids and vitamins, to
high value small molecule pharmaceuticals, protein pharmaceuticals, and industrial enzymes.
See, e.g., McCoy (1998) C&EN 13-19, for an introduction to biocatalysis.

Success in bringing these products to market and success in competing in the
market depend on continuous improvement of the whole cell biocatalysts. Improvements
include increased yield of desired products, removal of unwanted co-metabolites, improved
utilization of inexpensive carbon and nitrogen sources, adaptation to fermenter conditions,
increased production of a primary metabolite, increased production of a secondary metabolite,
increased tolerance to acidic conditions, increased tolerance to basic conditions, increased
tolerance to organic solvents, increased tolerance to high salt conditions and increased
tolerance to high or low temperatures. Shortcomings in any of these areas can result in high
manufacturing costs, inability to capture or maintain market share, and failure to bring
promising products to market. For this reason, the fermentation industry invests significant
financial and personnel resources in the improvement of production strains.

Current strategies for strain improvement rely on the empirical and iterative
modification of fermenter conditions and genetic manipulation of the producing organism.
While advances in the molecular biology of established industrial organisms have been made,
rational metabolic engineering is information intensive and is not broadly applicable to less
characterized industrial strains. The most widely practiced strategy for strain improvement
employs random mutagenesis of the producing strain and screening for mutants having
improved properties. For mature strains, those subjected to many rounds of improvement,
these efforts routinely provide a 10% increase in product titre per year. Although effective,
this classic strategy is slow, laborious, and expensive. Technological advances in this area are
aimed at automation and increasing sample screening throughput in hopes of reducing the cost
of strain improvement. However, the real technical barrier resides in the intrinsic limitation of
single mutations to effect significant strain improvement. The methods herein overcome this
limitation and provide access to multiple useful mutations per cycle which can be used to
complement automation technologies and catalyze strain improvement processes.

The methods herein allow biocatalysts to be improved at a faster pace than
conventional methods. Whole genome shuffling can at least double the rate of strain
improvement for microorganisms used in fermentation as compared to traditional methods.
This provides for a relative decrease in the cost of fermentation processes. New products can
enter the market sooner, producers can increase profits as well as market share, and
consumers gain access to more products of higher quality and at lower prices. Further,
increased efficiency of production processes translates to less waste production and more
frugal use of resources. Whole genome shuffling provides a means of accumulating multiple
useful mutations per cycle and thus eliminates the inherent limitation of current strain
improvement programs (SIPs).
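
By way of illustration only, the effect of doubling the rate of strain improvement can be
sketched numerically. The sketch assumes that the 10% yearly titre gain cited above for
mature strains compounds from year to year, and it reads "at least double the rate" as a 20%
yearly gain. The compounding model, the 20% figure, and the function names are assumptions
made for this example and are not part of the methods described.

```python
def projected_titre(start_titre, annual_gain, years):
    """Compound a fractional yearly titre gain over a number of years."""
    return start_titre * (1.0 + annual_gain) ** years


def years_to_reach(fold_target, annual_gain):
    """Smallest whole number of yearly cycles needed to reach a fold improvement."""
    years = 0
    titre = 1.0
    while titre < fold_target:
        titre *= 1.0 + annual_gain
        years += 1
    return years


if __name__ == "__main__":
    classical = 0.10  # 10% per year, the figure quoted in the text for mature strains
    shuffled = 0.20   # assumption: "double the rate" interpreted as 20% per year
    for years in (1, 5, 10):
        print(years,
              round(projected_titre(1.0, classical, years), 2),
              round(projected_titre(1.0, shuffled, years), 2))
    print(years_to_reach(2.0, classical), years_to_reach(2.0, shuffled))
```

Under these assumptions, a two-fold titre increase takes eight yearly cycles at 10% per year
and four yearly cycles at 20% per year.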

DNA shuffling provides recursive mutagenesis, recombination, and selection of
DNA sequences. A key difference between DNA shuffling-mediated recombination and
natural sexual recombination is that DNA shuffling effects both the pairwise (two parents) and
the poolwise (multiple parents) recombination of parent molecules, as described supra.
Natural recombination is more conservative and is limited to pairwise recombination. In
nature, pairwise recombination provides stability within a population by preventing large leaps
in sequences or genomic structure that can result from poolwise recombination. However, for
the purposes of directed evolution, poolwise recombination is appealing since the beneficial
mutations of multiple parents can be combined during a single cross to produce a superior
offspring. Poolwise recombination is analogous to the crossbreeding of inbred strains in
classic strain improvement, except that the crosses occur between many strains at once. In
essence, poolwise recombination is a sequence of events that effects the recombination of a
population of nucleic acid sequences and results in the generation of new nucleic acids that
contain genetic information from more than two of the original nucleic acids. The power of in
vitro DNA shuffling is that large combinatorial libraries can be generated from a small pool of
DNA fragments reassembled by recursive pairwise annealing and extension reactions, or
"matings." Many of the in vivo recombination formats described (such as plasmid-plasmid,
plasmid-chromosome, phage-phage, phage-chromosome, phage-plasmid, conjugal
DNA-chromosome, exogenous DNA-chromosome, chromosome-chromosome, with the DNA being
introduced into the cell by natural and non-natural competence, transduction, transfection,
conjugation, protoplast fusion, etc.) result primarily in the pairwise recombination of two
DNA molecules. Thus, these formats when executed for only a single cycle of recombination
are inherently limited in their potential to generate molecular diversity. To generate the level
of diversity obtained by in vitro DNA shuffling methods, pairwise mating formats must be
carried out recursively, i.e., for many generations, prior to screening for improved sequences.
Thus a pool of DNA sequences, such as four independent chromosomes, must be recombined,
for example by protoplast fusion, and the progeny of that recombination (each representing a
unique outcome of the pairwise mating) must then be pooled, without selection, and then
recombined again, and again, and again. This process should be repeated for a sufficient
number of cycles to result in progeny having the desired complexity. Only once sufficient
diversity has been generated should the resulting population be screened for new and
improved sequences.
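
The need for recursion without intervening selection can be illustrated with a toy
simulation. The sketch below assumes four parental genomes of twenty loci each, one hundred
copies of each parent, random pairing within the pool, and a single crossover per mating.
These parameters and the function names are hypothetical and are chosen only for
illustration. For each cycle, the sketch reports the fraction of the pool carrying loci from
more than two of the original parents.

```python
import random


def crossover(a, b, rng):
    """Single-point crossover between two equal-length genomes; returns two progeny."""
    point = rng.randint(1, len(a) - 1)  # cut strictly inside the genome
    return a[:point] + b[point:], b[:point] + a[point:]


def recombine_pool(pool, rng):
    """One cycle: shuffle the pool, mate adjacent pairs, pool all progeny without selection."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    progeny = []
    for i in range(0, len(shuffled) - 1, 2):
        progeny.extend(crossover(shuffled[i], shuffled[i + 1], rng))
    if len(shuffled) % 2:  # an unpaired genome is carried over unchanged
        progeny.append(shuffled[-1])
    return progeny


def fraction_multiparent(pool):
    """Fraction of genomes carrying loci from more than two of the original parents."""
    return sum(1 for genome in pool if len(set(genome)) > 2) / len(pool)


def simulate(cycles, n_parents=4, copies=100, loci=20, seed=0):
    """Run recursive pairwise recombination; each locus records its parent of origin."""
    rng = random.Random(seed)
    pool = [[parent] * loci for parent in range(n_parents) for _ in range(copies)]
    history = []
    for _ in range(cycles):
        pool = recombine_pool(pool, rng)
        history.append(fraction_multiparent(pool))
    return pool, history


if __name__ == "__main__":
    _, history = simulate(cycles=6)
    for cycle, fraction in enumerate(history, start=1):
        print(cycle, round(fraction, 3))
```

After a single cycle no genome can carry material from more than two parents, so the first
reported fraction is zero. Genomes combining three or four parents can appear only from the
second cycle onward, which is the point made above about single-cycle pairwise formats.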

There are a few general methods for effecting efficient recombination in
prokaryotes. Bacteria have no known sexual cycle per se, but there are natural mechanisms by
which the genomes of these organisms undergo recombination. These mechanisms include
natural competence, phage-mediated transduction, and cell-cell conjugation. Bacteria that are
naturally competent are capable of efficiently taking up naked DNA from the environment. If
homologous, this DNA undergoes recombination with the genome of the cell, resulting in
genetic exchange. Bacillus subtilis, the primary production organism of the enzyme industry,
is known for the efficiency with which it carries out this process.

In generalized transduction, a bacteriophage mediates genetic exchange. A
transducing phage will often package headfuls of the host genome. These phage can infect a
new host and deliver a fragment of the former host genome, which is frequently integrated via
homologous recombination. Cells can also transfer DNA between themselves by conjugation.
Cells containing the appropriate mating factors transfer episomes as well as entire
chromosomes to an appropriate acceptor cell, where the transferred DNA can recombine with
the acceptor genome. Conjugation resembles sexual recombination for microbes and can be
intraspecific, interspecific, and intergeneric. For example, an efficient means of transforming
Streptomyces sp., a genus responsible for producing many commercial antibiotics, is by the
conjugal transfer of plasmids from Escherichia coli.

For many industrial microorganisms, knowledge of competence, transducing
phage, or fertility factors is lacking. Protoplast fusion has been developed as a versatile and
general alternative to these natural methods of recombination. Protoplasts are prepared by
removing the cell wall by treating cells with lytic enzymes in the presence of osmotic
stabilizers. In the presence of a fusogenic agent, such as polyethylene glycol (PEG),
protoplasts are induced to fuse and form transient hybrids or "fusants." During this hybrid
state, genetic recombination occurs at high frequency, allowing the genomes to reassort. The
final crucial step is the successful segregation and regeneration of viable cells from the fused
protoplasts. Protoplast fusion can be intraspecific, interspecific, and intergeneric and has been
applied to both prokaryotes and eukaryotes. In addition, it is possible to fuse more than two
cells, thus providing a mechanism for effecting poolwise recombination. While no fertility
factors, transducing phages or competency development is needed for protoplast fusion, a
method for the formation, fusing, and regeneration of protoplasts is typically optimized for
each organism. Protoplast fusion as applied to poolwise recombination is described in more
detail, infra.

One key to SIP is having an assay that can be dependably used to identify a few
mutants out of thousands that have subtle increases in product yield. The limiting factor in
many assay formats is the uniformity of cell growth. Variation in cell growth is the source of
baseline variability in subsequent assays. Inoculum size and culture environment
(temperature/humidity) are sources of cell growth variation. Automation of all aspects of
establishing initial cultures and state-of-the-art temperature and humidity controlled incubators
are useful in reducing variability.

Mutant cells or spores are separated on solid media to produce individual
sporulating colonies. Using an automated colony picker (Q-bot, Genetix, U.K.), colonies are
identified, picked, and 10,000 different mutants inoculated into 96 well microtitre dishes
containing two 3 mm glass balls/well. The Q-bot does not pick an entire colony but rather
inserts a pin through the center of the colony and exits with a small sampling of cells (or
mycelia) and spores. The time the pin is in the colony, the number of dips to inoculate the
culture medium, and the time the pin is in that medium each affect inoculum size, and each can
be controlled and optimized. The uniform process of the Q-bot decreases human handling
error and increases the rate of establishing cultures (roughly 10,000/4 hours). These cultures
are then shaken in a temperature and humidity controlled incubator. The glass balls act to
promote uniform aeration of cells and the dispersal of mycelial fragments similar to the blades
of a fermenter. An embodiment of this procedure is further illustrated in Fig. 28, including an
integrated system for the assay.

1. Prescreen 

The ability to detect a subtle increase in the performance of a mutant over that
of a parent strain relies on the sensitivity of the assay. The chance of finding organisms
having an improvement increases with the number of individual mutants that can be screened
by the assay. To increase the chances of identifying a pool of sufficient size, a prescreen that
increases the number of mutants processed by 10-fold can be used. The goal of the primary
screen will be to quickly identify mutants having equal or better product titres than the parent
strain(s) and to move only these mutants forward to liquid cell culture.

The primary screen is an agar plate screen analyzed by the Q-bot colony
picker. Although assays can be fundamentally different, many result, e.g., in the production of
colony halos. For example, antibiotic production is assayed on plates using an overlay of a
sensitive indicator strain, such as B. subtilis. Antibiotic production is typically assayed as a
zone of clearing (inhibited growth of the indicator organism) around the producing organism.
Similarly, enzyme production can be assayed on plates containing the enzyme substrate, with
activity being detected as a zone of substrate modification around the producing colony.
Product titre is correlated with the ratio of halo area to colony area.

The Q-bot or other automated system is instructed to pick only colonies having
a halo ratio in the top 10% of the population, i.e., 10,000 mutants from the 100,000 entering
the plate prescreen. This increases the number of improved clones in the secondary assay and
eliminates the wasted effort of screening knock-out and low producers. This improves the "hit
rate" of the secondary assay.
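
The halo-based prescreen lends itself to a simple calculation. The following sketch assumes
that halo and colony diameters have already been measured by image analysis and that both
are approximately circular. The record fields and function names are hypothetical and do not
correspond to any Q-bot software interface.

```python
import math


def halo_ratio(halo_diameter_mm, colony_diameter_mm):
    """Ratio of halo area to colony area, treating both as circles."""
    halo_area = math.pi * (halo_diameter_mm / 2.0) ** 2
    colony_area = math.pi * (colony_diameter_mm / 2.0) ** 2
    return halo_area / colony_area


def pick_top_fraction(colonies, fraction=0.10):
    """Return the IDs of colonies whose halo ratio is in the top `fraction` of the plate."""
    ranked = sorted(
        colonies,
        key=lambda c: halo_ratio(c["halo_mm"], c["colony_mm"]),
        reverse=True,
    )
    n_pick = max(1, round(len(ranked) * fraction))
    return [c["id"] for c in ranked[:n_pick]]


if __name__ == "__main__":
    # Hypothetical plate: twenty colonies of equal size with increasing halo diameter.
    plate = [{"id": i, "halo_mm": 4.0 + i, "colony_mm": 2.0} for i in range(20)]
    print(pick_top_fraction(plate))
```

Ranking on the ratio rather than on halo size alone keeps a large colony with a
proportionally large halo from being scored above a small colony that produces more product
per unit of biomass.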

M. PROMOTION OF GENETIC EXCHANGE 

1. General

Some methods of the invention effect recombination of cellular DNA by
propagating cells under conditions inducing exchange of DNA between cells. DNA exchange
can be promoted by generally applicable methods such as electroporation, biolistics, cell
fusion, or in some instances, by conjugation, transduction, or Agrobacterium-mediated transfer
and meiosis. For example, Agrobacterium can transform S. cerevisiae with T-DNA, which is
incorporated into the yeast genome by both homologous recombination and a gap repair
mechanism. (Piers et al., Proc. Natl. Acad. Sci. USA 93(4), 1613-8 (1996)).

In some methods, initial diversity between cells (i.e., before genome exchange)
is induced by chemical or radiation-induced mutagenesis of a progenitor cell type, optionally
followed by screening for a desired phenotype. In other methods, diversity is natural, as where
cells are obtained from different individuals, strains or species.

In some shuffling methods, induced exchange of DNA is used as the sole means
of effecting recombination in each cycle of recombination. In other methods, induced
exchange is used in combination with natural sexual recombination of an organism. In other
methods, induced exchange and/or natural sexual recombination are used in combination with
the introduction of a fragment library. Such a fragment library can be a whole genome, a
whole chromosome, a group of functionally or genetically linked genes, a plasmid, a cosmid, a
mitochondrial genome, a viral genome (replicative and nonreplicative) or specific or random
fragments of any of these. The DNA can be linked to a vector or can be in free form. Some
vectors contain sequences promoting homologous or nonhomologous recombination with the
host genome. Some fragments contain double stranded breaks, such as caused by shearing
with glass beads, sonication, or chemical or enzymatic fragmentation, to stimulate
recombination.

In each case, DNA can be exchanged between cells, after which it can undergo
recombination to form hybrid genomes. Generally, cells are recursively subject to
recombination to increase the diversity of the population prior to screening. Cells bearing
hybrid genomes, e.g., generated after at least one, and usually several cycles of recombination,
are screened for a desired phenotype, and cells having this phenotype are isolated. These cells
can additionally form starting materials for additional cycles of recombination in a recursive
recombination/selection scheme.

One means of promoting exchange of DNA between cells is by fusion of cells,
such as by protoplast fusion. A protoplast results from the removal from a cell of its cell wall,
leaving a membrane-bound cell that depends on an isotonic or hypertonic medium for
maintaining its integrity. If the cell wall is partially removed, the resulting cell is strictly
referred to as a spheroplast and if it is completely removed, as a protoplast. However, here
the term protoplast includes spheroplasts unless otherwise indicated.

Protoplast fusion is described by Shaffner et al., Proc. Natl. Acad. Sci. USA 77,
2163 (1980) and other exemplary procedures are described by Yoakum et al., US 4,608,339,
Takahashi et al., US 4,677,066 and Sambrook et al., at Ch. 16. Protoplast fusion has been
reported between strains, species, and genera (e.g., yeast and chicken erythrocyte).

Protoplasts can be prepared for both bacterial and eukaryotic cells, including
mammalian cells and plant cells, by several means including chemical treatment to strip cell
walls. For example, cell walls can be stripped by digestion with a cell wall degrading enzyme
such as lysozyme in a 10-20% sucrose, 50 mM EDTA buffer. Conversion of cells to spherical
protoplasts can be monitored by phase-contrast microscopy. Protoplasts can also be prepared
by propagation of cells in media supplemented with an inhibitor of cell wall synthesis, or use of
mutant strains lacking capacity for cell wall formation. Preferably, eukaryotic cells are
synchronized in G1 phase by arrest with inhibitors such as α-factor, K. lactis killer toxin,
leflonamide and adenylate cyclase inhibitors. Optionally, some but not all protoplasts to be
fused can be killed and/or have their DNA fragmented by treatment with ultraviolet irradiation,
hydroxylamine or cupferon (Reeves et al., FEMS Microbiol. Lett. 99, 193-198 (1992)). In
this situation, killed protoplasts are referred to as donors, and viable protoplasts as acceptors.
Using dead donor cells can be advantageous in subsequently recognizing fused cells with
hybrid genomes, as described below. Further, breaking up DNA in donor cells is
advantageous for stimulating recombination with acceptor DNA. Optionally, acceptor and/or
fused cells can also be briefly, but nonlethally, exposed to UV irradiation to further stimulate
recombination.

Once formed, protoplasts can be stabilized in a variety of osmolytes and
compounds such as sodium chloride, potassium chloride, sodium phosphate, potassium
phosphate, sucrose, and sorbitol in the presence of DTT. The combination of buffer, pH,
reducing agent, and osmotic stabilizer can be optimized for different cell types. Protoplasts
can be induced to fuse by treatment with a chemical such as PEG, calcium chloride or calcium
propionate, or by electrofusion (Tsoneva, Acta Microbiologica Bulgarica 24, 53-59 (1989)). A
method of cell fusion employing electric fields has also been described. See Chang, US
4,970,154. Conditions can be optimized for different strains.

The fused cells are heterokaryons containing genomes from two or more
component protoplasts. Fused cells can be enriched from unfused parental cells by sucrose
gradient sedimentation or cell sorting. The two nuclei in the heterokaryons can fuse
(karyogamy) and homologous recombination can occur between the genomes. The
chromosomes can also segregate asymmetrically, resulting in regenerated protoplasts that have
lost or gained whole chromosomes. The frequency of recombination can be increased by
treatment with ultraviolet irradiation or by use of strains overexpressing recA or other
recombination genes, or the yeast rad genes, and cognate variants thereof in other species, or
by the inhibition of gene products of MutS, MutL, or MutD. Overexpression can be either the
result of introduction of exogenous recombination genes or the result of selecting strains
which, as a result of natural variation or induced mutation, overexpress endogenous
recombination genes. The fused protoplasts are propagated under conditions allowing
regeneration of cell walls, recombination and segregation of recombinant genomes into
progeny cells from the heterokaryon and expression of recombinant genes. This process can
be reiteratively repeated to increase the diversity of any set of protoplasts or cells. After, or
occasionally before or during, recovery of fused cells, the cells are screened or selected for
evolution toward a desired property.

Thereafter a subsequent round of recombination can be performed by preparing
protoplasts from the cells surviving selection/screening in a previous round. The protoplasts
are fused, recombination occurs in fused protoplasts, and cells are regenerated from the fused
protoplasts. This process can again be reiteratively repeated to increase the diversity of the
starting population. Protoplasts, regenerated or regenerating cells are subject to further
selection or screening.

Subsequent rounds of recombination can be performed on a split pool basis as
described above. That is, a first subpopulation of cells surviving selection/screening from a
previous round is used for protoplast formation. A second subpopulation of cells surviving
selection/screening from a previous round is used as a source for DNA library preparation.
The DNA library from the second subpopulation of cells is then transformed into the
protoplasts from the first subpopulation. The library undergoes recombination with the
genomes of the protoplasts to form recombinant genomes. This process can be repeated
several times in the absence of a selection event to increase the diversity of the cell population.
Cells are regenerated from protoplasts, and selection/screening is applied to regenerating or
regenerated cells. In a further variation, a fresh library of nucleic acid fragments is introduced
into protoplasts surviving selection/screening from a previous round.

An exemplary format for shuffling using protoplast fusion is shown in Fig. 5.
The figure shows the following steps: protoplast formation of donor and recipient strains,
heterokaryon formation, karyogamy, recombination, and segregation of recombinant genomes
into separate cells. Optionally, cells bearing the recombinant genomes, if having a sexual cycle,
can undergo further recombination with each other as a result of meiosis and mating. Recursive
cycles of protoplast fusion, or recursive mating/meiosis, are often used to increase the diversity
of a cell population. After achieving a sufficiently diverse population via one of these forms of
recombination, cells are screened or selected for a desired property. Cells surviving
selection/screening can then be used as the starting materials in a further cycle of protoplasting
or other recombination methods as noted herein.

2. Selection For Hybrid Strains

The invention provides selection strategies to identify cells formed by fusion of
components from parental cells from two or more distinct subpopulations. Selection for
hybrid cells is usually performed before selecting or screening for cells that have evolved (as a
result of genetic exchange) to acquisition of a desired property. A basic premise of most such
selection schemes is that two initial subpopulations have two distinct markers. Cells with
hybrid genomes can thus be identified by selection for both markers.
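
The two-marker premise can be expressed compactly. In the sketch below, cells are
represented as records carrying a set of marker names. The marker names (a biotin membrane
label and a drug-resistance gene), the cell names, and the data layout are assumptions made
for illustration only.

```python
def fuse(cell_a, cell_b):
    """Model a fusant as carrying the markers of both component cells."""
    return {
        "name": cell_a["name"] + "+" + cell_b["name"],
        "markers": cell_a["markers"] | cell_b["markers"],
    }


def select_hybrids(cells, first_marker, second_marker):
    """Keep only cells that pass selection for both markers."""
    return [
        c for c in cells
        if first_marker in c["markers"] and second_marker in c["markers"]
    ]


# Hypothetical subpopulations: one labelled with biotin, one carrying a resistance gene.
pop_1 = [{"name": "A1", "markers": {"biotin"}}, {"name": "A2", "markers": {"biotin"}}]
pop_2 = [{"name": "B1", "markers": {"drug_resistance"}}]

# After fusion, the recovered population contains unfused parents, a fusant between the
# two subpopulations, and a fusant formed within a single subpopulation.
recovered = pop_1 + pop_2 + [fuse(pop_1[0], pop_2[0]), fuse(pop_1[0], pop_1[1])]
hybrids = select_hybrids(recovered, "biotin", "drug_resistance")
print([c["name"] for c in hybrids])
```

Only the fusant formed between the two subpopulations passes both selections. Unfused
parents and fusants formed within one subpopulation each lack one of the two markers.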

In one such scheme, at least one subpopulation of cells bears a selective marker
attached to its cell membrane. Examples of suitable membrane markers include biotin,
fluorescein and rhodamine. The markers can be linked to amide or thiol groups or through
more specific derivatization chemistries, such as iodo-acetates, iodoacetamides, maleimides.
For example, a marker can be attached as follows. Cells or protoplasts are washed with a
buffer (e.g., PBS), which does not interfere with the chemical coupling of a chemically active
ligand which reacts with amino groups of lysines or N-terminal amino groups of membrane
proteins. The ligand is either amine reactive itself (e.g., isothiocyanates, succinimidyl esters,
sulfonyl chlorides) or is activated by a heterobifunctional linker (e.g., EMCS, SIAB, SPDP,
SMB) to become amine reactive. The ligand is a molecule which is easily bound by protein
derivatized magnetic beads or other capturing solid supports. For example, the ligand can be
succinimidyl activated biotin (Molecular Probes Inc.: B-1606, B-2603, S-1515, S-1582). This
linker is reacted with amino groups of proteins residing in and on the surface of a cell. The
cells are then washed to remove excess labelling agent before contacting with cells from the
second subpopulation bearing a second selective marker.

The second subpopulation of cells can also bear a membrane marker, albeit a
different membrane marker from the first subpopulation. Alternatively, the second
subpopulation can bear a genetic marker. The genetic marker can confer a selective property
such as drug resistance or a screenable property, such as expression of green fluorescent
protein.

After fusion of first and second subpopulations of cells and recovery, cells are
screened or selected for the presence of markers on both parental subpopulations. For
example, fusants are enriched for one population by adsorption to specific beads and these are
then sorted by FACS for those expressing a marker. Cells surviving both screens for both
markers are those having undergone protoplast fusion, and are therefore more likely to have
recombined genomes. Usually, the markers are screened or selected separately.
Membrane-bound markers, such as biotin, can be screened by affinity enrichment for the cell
membrane marker (e.g., by panning fused cells on an affinity matrix). For example, for a biotin
membrane label, cells can be affinity purified using streptavidin-coated magnetic beads
(Dynal). These beads are washed several times to remove the non-fused host cells.
Alternatively, cells can be panned against an antibody to the membrane marker. In a further
variation, if the membrane marker is fluorescent, cells bearing the marker can be identified by
FACS. Screens for genetic markers depend on the nature of the markers, and include capacity
to grow on drug-treated media or FACS selection for green fluorescent protein. If first and
second cell populations have fluorescent markers of different wavelengths, both markers can
be screened simultaneously by FACS sorting.

In a further selection scheme for hybrid cells, first and second populations of
cells to be fused express different subunits of a heteromultimeric enzyme. Usually, the
heteromultimeric enzyme has two different subunits, but heteromultimeric enzymes having
three, four or more different subunits can be used. If an enzyme has more than two different
subunits, each subunit can be expressed in a different subpopulation of cells (e.g., three
subunits in three subpopulations), or more than one subunit can be expressed in the same
subpopulation of cells (e.g., one subunit in one subpopulation, two subunits in a second
subpopulation). In the case where more than two subunits are used, selection for the poolwise
recombination of more than two protoplasts can be achieved.

Hybrid cells representing a combination of genomes of first, second or more
subpopulation component cells can then be recognized by an assay for intact enzyme. Such an
assay can be a binding assay, but is more typically a functional assay (e.g., capacity to
metabolize a substrate of the enzyme). Enzymatic activity can be detected, for example, by
processing of a substrate to a product with a fluorescent or otherwise easily detectable
absorbance or emission spectrum. The individual subunits of a heteromultimeric enzyme used
in such an assay preferably have no enzymic activity in dissociated form, or at least have
significantly less activity in dissociated form than associated form. Preferably, the cells used
for fusion lack an endogenous form of the heteromultimeric enzyme, or at least have
significantly less endogenous activity than results from heteromultimeric enzyme formed by
fusion of cells.

Penicillin acylase enzymes, cephalosporin acylase and penicillin acyltransferase
are examples of suitable heteromultimeric enzymes. These enzymes are encoded by a single
gene, which is translated as a proenzyme and cleaved by posttranslational autocatalytic
proteolysis to remove a spacer endopeptide and generate two subunits, which associate to
form the active heterodimeric enzyme. Neither subunit is active in the absence of the other
subunit. However, activity can be reconstituted if these separated gene portions are expressed
in the same cell by co-transformation. Other enzymes that can be used have subunits that are
encoded by distinct genes (e.g., faoA and faoB genes encode 3-oxoacyl-CoA thiolase of
Pseudomonas fragi (Biochem. J. 328, 815-820 (1997))).

An exemplary enzyme is penicillin G acylase from Escherichia coli, which has
two subunits encoded by a single gene. Fragments of the gene encoding the two subunits
operably linked to appropriate expression regulation sequences are transfected into first and
second subpopulations of cells, which lack endogenous penicillin acylase activity. A cell
formed by fusion of component cells from the first and second subpopulations expresses the
two subunits, which assemble to form functional enzyme, e.g., penicillin acylase. Fused cells
can then be selected on agar plates containing penicillin G, which is degraded by penicillin
acylase.

In another variation, fused cells are identified by complementation of
auxotrophic mutants. Parental subpopulations of cells can be selected for known auxotrophic
mutations. Alternatively, auxotrophic mutations in a starting population of cells can be
generated spontaneously or by exposure to a mutagenic agent. Cells with auxotrophic mutations
are selected by replica plating on minimal and complete media. Lesions resulting in
auxotrophy are expected to be scattered throughout the genome, in genes for amino acid,
nucleotide, and vitamin biosynthetic pathways. After fusion of parental cells, cells resulting
from fusion can be identified by their capacity to grow on minimal media. These cells can then
be screened or selected for evolution toward a desired property. Further steps of mutagenesis
generating fresh auxotrophic mutations can be incorporated in subsequent cycles of
recombination and screening/selection.
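
The logic of selecting fusants by complementation can be sketched as follows. The sketch
assumes a hypothetical set of four biosynthetic functions required for growth on minimal
medium, and it assumes that a hybrid retains every function that is intact in either parent.
The gene names and function names are illustrative only.

```python
# Hypothetical set of biosynthetic functions required for growth on minimal medium.
REQUIRED = frozenset({"his", "leu", "trp", "ura"})


def make_auxotroph(*lesions):
    """Return the functions still intact in a strain carrying the given lesions."""
    return REQUIRED - frozenset(lesions)


def grows_on_minimal(intact_functions):
    """A strain grows on minimal medium only if every required function is intact."""
    return REQUIRED <= intact_functions


def fusant(parent_a, parent_b):
    """Assumption: the hybrid retains every function intact in either parent."""
    return parent_a | parent_b


his_minus = make_auxotroph("his")
leu_minus = make_auxotroph("leu")
his_trp_minus = make_auxotroph("his", "trp")

print(grows_on_minimal(his_minus))                            # parent alone
print(grows_on_minimal(fusant(his_minus, leu_minus)))         # complementing lesions
print(grows_on_minimal(fusant(his_minus, his_trp_minus)))     # shared lesion
```

Parents carrying lesions in different functions complement each other, so their fusants
grow on minimal medium. Parents sharing a lesion do not complement, which is why lesions
scattered throughout the genome are useful for this selection.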

In variations of the above method, de novo generation of auxotrophic
mutations in each round of shuffling can be avoided by reusing the same auxotrophs. For
example, auxotrophs can be generated by transposon mutagenesis using a transposon bearing
a selective marker. Auxotrophs are identified by a screen such as replica plating. Auxotrophs
are pooled, and a generalized transducing phage lysate is prepared by growth of phage on a
population of auxotrophic cells. A separate population of auxotrophic cells is subjected to
genetic exchange, and complementation is used to select cells that have undergone genetic
exchange and recombination. These cells are then screened or selected for acquisition of a
desired property. Cells surviving screening or selection then have auxotrophic markers
regenerated by introduction of the transducing transposon library. The newly generated
auxotrophic cells can then be subject to further genetic exchange and screening/selection.

In a further variation, auxotrophic mutations are generated by homologous
recombination with a targeting vector comprising a selective marker flanked by regions of
homology with a biosynthetic region of the genome of cells to be evolved. Recombination
between the vector and the genome inserts the positive selection marker into the genome,
causing an auxotrophic mutation. The vector is in linear form before introduction into cells.
Optionally, the frequency of introduction of the vector can be increased by capping its ends
with self-complementary oligonucleotides annealed in a hairpin formation. Genetic
exchange and screening/selection proceed as described above. In each round, targeting
vectors are reintroduced, regenerating the same population of auxotrophic markers.

In another variation, fused cells are identified by screening for a genomic
marker present on one subpopulation of parental cells and an episomal marker present on a
second subpopulation of cells. For example, a first subpopulation of yeast containing
mitochondria can be used to complement a second subpopulation of yeast having a petite
phenotype (i.e., lacking mitochondria).

In a further variation, genetic exchange is performed between two
subpopulations of cells, one of which is dead. Cells are preferably killed by brief exposure to
DNA fragmenting agents such as hydroxylamine, cupferon, or irradiation. Viable cells are
then screened for a marker present on the dead parental subpopulation.

25 3 . Liposome-mediated transfers 

In the methods noted above, in which nucleic acid fragment libraries are
introduced into protoplasts, the nucleic acids are sometimes encapsulated in liposomes to
facilitate uptake by protoplasts. Liposome-mediated uptake of DNA by protoplasts is described
in Redford et al., Mol. Gen. Genet. 184, 567-569 (1981). Liposomes can efficiently deliver
large volumes of DNA to protoplasts (see Deshayes et al., EMBO J. 4, 2731-2737 (1985)).
See also, Philippot and Schuber (eds) (1995) Liposomes as Tools in Basic Research and
Industry, CRC Press, Boca Raton, e.g., Chapter 9, Remy et al., "Gene Transfer with Cationic
Amphiphiles." Further, the DNA can be delivered as linear fragments, which are often more
recombinogenic than whole genomes. In some methods, fragments are mutated prior to
encapsulation in liposomes. In some methods, fragments are combined with RecA and
homologs, or nucleases (e.g., restriction endonucleases) before encapsulation in liposomes to
promote recombination. Alternatively, protoplasts can be treated with lethal doses of nicking
reagents and then fused. Cells which survive are those which are repaired by recombination
with other genomic fragments, thereby providing a selection mechanism to select for
recombinant (and therefore desirably diverse) protoplasts.

4. Shuffling Filamentous Fungi
Filamentous fungi are particularly suited to performing the shuffling methods
described above. Filamentous fungi are divided into four main classifications based on their
structures for sexual reproduction: Phycomycetes, Ascomycetes, Basidiomycetes and the
Fungi Imperfecti. Phycomycetes (e.g., Rhizopus, Mucor) form sexual spores in a sporangium.
The spores can be uni- or multinucleate and often lack septated hyphae (coenocytic).
Ascomycetes (e.g., Aspergillus, Neurospora, Penicillium) produce sexual spores in an ascus as
a result of meiotic division. Asci typically contain 4 meiotic products, but some contain 8 as a
result of additional mitotic division. Basidiomycetes include mushrooms and smuts, and form
sexual spores on the surface of a basidium. In holobasidiomycetes, such as mushrooms, the
basidium is undivided. In hemibasidiomycetes, such as rusts (Uredinales) and smut fungi
(Ustilaginales), the basidium is divided. Fungi imperfecti, which include most human
pathogens, have no known sexual stage.

Fungi can reproduce by asexual, sexual or parasexual means. Asexual
reproduction involves vegetative growth of mycelia, nuclear division and cell division without
involvement of gametes and without nuclear fusion. Cell division can occur by sporulation,
budding or fragmentation of hyphae.

Sexual reproduction provides a mechanism for shuffling genetic material
between cells. A sexual reproductive cycle is characterized by an alternation of a haploid phase
and a diploid phase. Diploidy occurs when two haploid gamete nuclei fuse (karyogamy). The
gamete nuclei can come from the same parental strains (self-fertile), such as in the homothallic
fungi. In heterothallic fungi, the parental strains come from strains of different mating type.

A diploid cell converts to haploidy via meiosis, which essentially consists of
two divisions of the nucleus accompanied by one division of the chromosomes. The products
of one meiosis are a tetrad (4 haploid nuclei). In some cases, a mitotic division occurs after
meiosis, giving rise to eight product cells. The arrangement of the resultant cells (usually
enclosed in spores) resembles that of the parental strains. The length of the haploid and
diploid stages differs in various fungi: for example, the Basidiomycetes and many of the
Ascomycetes have a mostly haploid life cycle (that is, meiosis occurs immediately after
karyogamy), whereas others (e.g., Saccharomyces cerevisiae) are diploid for most of their life
cycle (karyogamy occurs soon after meiosis). Sexual reproduction can occur between cells in
the same strain (selfing) or between cells from different strains (outcrossing).

Sexual dimorphism (dioecism) is the separate production of male and female
organs on different mycelia. This is a rare phenomenon among the fungi, although a few
examples are known. Heterothallism (one locus-two alleles) allows for outcrossing between
cross-compatible strains which are self-incompatible. The simplest form is the two allele-one
locus system of mating types/factors, illustrated by the following organisms:
A and a in Neurospora; a and α in Saccharomyces; plus and minus in Schizosaccharomyces
and Zygomycetes; a1 and a2 in Ustilago.

Multiple-allelomorph heterothallism is exhibited by some of the higher
Basidiomycetes (e.g. Gasteromycetes and Hymenomycetes), which are heterothallic and have
several mating types determined by multiple alleles. Heterothallism in these organisms is either
bipolar with one mating type factor, or tetrapolar with two unlinked factors, A and B. Stable,
fertile heterokaryon formation depends on the presence of different A factors and, in the case
of tetrapolar organisms, of different B factors as well. This system is effective in the
promotion of outbreeding and the prevention of self-breeding. The number of different mating
factors may be very large (i.e. thousands) (Kothe, FEMS Microbiol. Rev. 18, 65-87 (1996)),
and non-parental mating factors may arise by recombination.

Parasexual reproduction provides a further means for shuffling genetic material
between cells. This process allows recombination of parental DNA without involvement of
mating types or gametes. Parasexual fusion occurs by hyphal fusion giving rise to a common
cytoplasm containing different nuclei. The two nuclei can divide independently in the resulting
heterokaryon but occasionally fuse. Fusion is followed by haploidization, which can involve
loss of chromosomes and mitotic crossing over between homologous chromosomes. Protoplast
fusion is a form of parasexual reproduction.

Within the above four classes, fungi are also classified by vegetative
compatibility group. Fungi within a vegetative compatibility group can form heterokaryons
with each other. Thus, for exchange of genetic material between different strains of fungi, the
fungi are usually prepared from the same vegetative compatibility group. However, some
genetic exchange can occur between fungi from different incompatibility groups as a result of
parasexual reproduction (see Timberlake et al., US 5,605,820). Further, as discussed
elsewhere, the natural vegetative compatibility group of fungi can be expanded as a result of
shuffling.

Several isolates of Aspergillus nidulans, A. flavus, A. fumigatus, Penicillium
chrysogenum, P. notatum, Cephalosporium chrysogenum, Neurospora crassa, and Aureobasidium
pullulans have been karyotyped. Genome sizes generally range between 20 and 50 Mb among
the Aspergilli. Differences in karyotypes often exist between similar strains and are also
caused by transformation with exogenous DNA. Filamentous fungal genes contain introns,
usually ~50-100 bp in size, with similar consensus 5' and 3' splice sequences. Promotion and
termination signals are often cross-recognizable, enabling the expression of a gene/pathway
from one fungus (e.g. A. nidulans) in another (e.g. P. chrysogenum).

The major components of the fungal cell wall are chitin (or chitosan), β-glucan,
and mannoproteins. Chitin and β-glucan form the scaffolding; mannoproteins are interstitial
components which dictate the wall's porosity, antigenicity and adhesion. Chitin synthetase
catalyzes the polymerization of β-(1,4)-linked N-acetylglucosamine (GlcNAc) residues,
forming linear strands running antiparallel; β-(1,3)-glucan synthetase catalyzes the
homopolymerization of glucose.

One general goal of shuffling is to evolve fungi to become useful hosts for
genetic engineering, in particular for the shuffling of unrelated genes. A. nidulans and
Neurospora are generally the fungal organisms of choice to serve as hosts for such
manipulations because of their sexual cycles and well-established use in classical and molecular
genetics. Another general goal is to improve the capacity of fungi to make specific
compounds (e.g. antibacterials (penicillins, cephalosporins), antifungals (e.g. echinocandins,
aureobasidins), and wood-degrading enzymes). There is some overlap between these general
goals, and thus, some desired properties are useful for achieving both goals.

One desired property is the introduction of meiotic apparatus into fungi
presently lacking a sexual cycle (see Sharon et al., Mol. Gen. Genet. 251, 60-68 (1996)). A
scheme for introducing a sexual cycle into the fungus P. chrysogenum (a fungus imperfecti) is
shown in Fig. 6. Subpopulations of protoplasts are formed from A. nidulans (which has a
sexual cycle) and P. chrysogenum, which does not. The two strains preferably bear different
markers. The A. nidulans protoplasts are killed by treatment with UV. The
two subpopulations are fused to form heterokaryons. In some heterokaryons, nuclei fuse, and
some recombination occurs. Fused cells are cultured under conditions to generate new cell
walls and then to allow sexual recombination to occur. Cells with recombinant genomes are
then selected (e.g., by selecting for complementation of auxotrophic markers present on the
respective parent strains). Cells with hybrid genomes are more likely to have acquired the
genes necessary for a sexual cycle. Protoplasts of cells can then be crossed with killed
protoplasts of a further population of cells known to have a sexual cycle (the same or different
as the previous round) in the same manner, followed by selection for cells with hybrid
genomes.

Another desired property is the production of a mutator strain of fungi. Such a
fungus can be produced by shuffling a fungal strain containing a marker gene with one or more
mutations that impair or prevent expression of a functional product. Shufflants are propagated
under conditions that select for expression of the positive marker (while allowing a small
amount of residual growth without expression). Shufflants growing fastest are selected to
form the starting materials for the next round of shuffling.

Another desired property is to expand the host range of a fungus so it can form
heterokaryons with fungi from other vegetative compatibility groups. Incompatibility
between species results from the interactions of specific alleles at different incompatibility loci
(such as the "het" loci). If two strains undergo hyphal anastomosis, a lethal cytoplasmic
incompatibility reaction may occur if the strains differ at these loci. Strains must carry
identical loci to be entirely compatible. Several of these loci have been identified in various
species, and the incompatibility effect is somewhat additive (hence, "partial incompatibility"
can occur). Some tolerant and het-negative mutants have been described for these organisms
(e.g. Dales & Croft, J. Gen. Microbiol. 136, 1717-1724 (1990)). Further, a tolerance gene
(tol) has been reported, which suppresses mating-type heterokaryon incompatibility. Shuffling
is performed between protoplasts of strains from different incompatibility groups. A preferred
format uses a live acceptor strain and a UV-irradiated dead donor strain. The UV
irradiation serves to introduce mutations into DNA, inactivating het genes. The two strains
should bear different genetic markers. Protoplasts of the strains are fused, cells are regenerated
and screened for complementation of markers. Subsequent rounds of shuffling and selection
can be performed in the same manner by fusing the cells surviving screening with protoplasts
of a fresh population of donor cells. Similar to other procedures noted herein, the cells
resulting from regeneration of the protoplasts are optionally re-fused by protoplasting and
regenerated into cells one or more times prior to any selection step to increase the diversity of
the resulting population of cells to be screened.

Another desired property is the introduction of multiple-allelomorph
heterothallism into Ascomycetes and Fungi imperfecti, which do not normally exhibit this
property. This mating system allows outbreeding without self-breeding. Such a mating
system can be introduced by shuffling Ascomycetes and Fungi imperfecti with DNA from
Gasteromycetes or Hymenomycetes, which have such a system.

Another desired property is spontaneous formation of protoplasts to facilitate
use of a fungal strain as a shuffling host. Here, the fungus to be evolved is typically
mutagenized. Spores of the fungus to be evolved are briefly treated with a cell-wall degrading
agent for a time insufficient for complete protoplast formation, and are mixed with protoplasts
from other strain(s) of fungi. Protoplasts formed by fusion of the two different subpopulations
are identified by genetic or other selection or screening as described above. These protoplasts
are used to regenerate mycelia and then spores, which form the starting material for the next
round of shuffling. In the next round, at least some of the surviving spores are treated with
cell-wall removing enzyme but for a shorter time than the previous round. After treatment,
the partially stripped cells are labeled with a first label. These cells are then mixed with
protoplasts, which may derive from other cells surviving selection in a previous round, or from
a fresh strain of fungi. These protoplasts are physically labeled with a second label. After
incubating the cells under conditions for protoplast fusion, fusants with both labels are selected.
These fusants are used to generate mycelia and spores for the next round of shuffling, and so
forth. Eventually, progeny that spontaneously form protoplasts (i.e., without addition of cell
wall degrading agent) are identified. As with other procedures noted herein, cells or
protoplasts can be reiteratively fused and regenerated prior to performing any selection step to
increase the diversity of the resulting cells or protoplasts to be screened. Similarly, selected
cells or protoplasts can be reiteratively fused and regenerated for one or several cycles without
imposing selection on the resulting cellular or protoplast populations, thereby increasing the
diversity of cells or protoplasts which are eventually screened. This process of performing
multiple cycles of recombination interspersed with selection steps can be reiteratively repeated
as desired.

Another desired property is the acquisition and/or improvement of genes
encoding enzymes in biosynthetic pathways, genes encoding transporter proteins, and genes
encoding proteins involved in metabolic flux control. In this situation, genes of the pathway
can be introduced into the fungus to be evolved either by genetic exchange with another strain
of fungus possessing the pathway or by introduction of a fragment library from an organism
possessing the pathway. Genetic material of these fungi can then be subjected to further
shuffling and screening/selection by the various procedures discussed in this application.
Shufflant strains of fungi are selected/screened for production of the compound produced by
the metabolic pathway or precursors thereof.

Another desired property is increasing the stability of fungi to extreme
conditions such as heat. In this situation, genes conferring stability can be acquired by
exchanging DNA with or transforming DNA from a strain that already has such properties.
Alternatively, the strain to be evolved can be subjected to random mutagenesis. Genetic
material of the fungus to be evolved can be shuffled by any of the procedures described in this
application, with shufflants being selected by surviving exposure to extreme conditions.

Another desired property is capacity of a fungus to grow under altered
nutritional requirements (e.g., growth on particular carbon or nitrogen sources). Altering
nutritional requirements is particularly valuable, e.g., for natural isolates of fungi that produce
valuable commercial products but have esoteric and therefore expensive nutritional
requirements. The strain to be evolved undergoes genetic exchange and/or transformation with
DNA from a strain that has the desired nutritional requirements. The fungus to be evolved can
then optionally be subjected to further shuffling as described in this application, with
recombinant strains being selected for capacity to grow in the desired nutritional
circumstances. Optionally, the nutritional circumstances can be varied in successive rounds of
shuffling starting at close to the natural requirements of the fungus to be evolved and in
subsequent rounds approaching the desired nutritional requirements.
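
The gradual change of nutritional circumstances over successive rounds can be pictured as a
schedule of media, one per round. The following Python sketch is illustrative only and is not
part of the described method: it assumes a simple linear interpolation between a natural and a
desired medium, starts exactly at the natural composition for simplicity, and uses hypothetical
component names and arbitrary concentration units.

```python
def nutrient_schedule(natural, desired, rounds):
    """Return one medium composition per round, moving linearly from the
    natural composition to the desired one.

    Compositions are dicts mapping a component name to a concentration
    in arbitrary units. Linear interpolation is an assumption of this
    sketch; any monotonic schedule would serve the same purpose.
    """
    if rounds < 2:
        raise ValueError("need at least two rounds to interpolate")
    components = sorted(set(natural) | set(desired))
    schedule = []
    for i in range(rounds):
        t = i / (rounds - 1)  # 0.0 in the first round, 1.0 in the last
        schedule.append({
            c: (1 - t) * natural.get(c, 0.0) + t * desired.get(c, 0.0)
            for c in components
        })
    return schedule


if __name__ == "__main__":
    # Hypothetical example: shift from glucose to xylose over three rounds.
    for medium in nutrient_schedule({"glucose": 10.0}, {"xylose": 10.0}, 3):
        print(medium)
```

Shufflants surviving selection on the medium of one round would form the starting material
for the next round, grown on the next medium in the schedule.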

Another desired property is acquisition of natural competence in a fungus. The
procedure for acquisition of natural competence by shuffling is generally described in
PCT/US97/04494. The fungus to be evolved typically undergoes genetic exchange or
transformation with DNA from a bacterial strain or fungal strain that already has this property.
Cells with recombinant genomes are then selected by capacity to take up a plasmid bearing a
selective marker. Further rounds of recombination and selection can be performed using any
of the procedures described above.

Another desired property is reduced or increased secretion of proteases and
DNase. In this situation, the fungus to be evolved can acquire DNA by exchange or
transformation from another strain known to have the desired property. Alternatively, the
fungus to be evolved can be subject to random mutagenesis. The fungus to be evolved is
shuffled as above. The presence of such enzymes, or lack thereof, can be assayed by
contacting the culture media from individual isolates with a fluorescent molecule tethered to a
support via a peptide or DNA linkage. Cleavage of the linkage releases detectable
fluorescence to the media.

Another desired property is producing fungi with altered transporters (e.g.,
MDR). Such altered transporters are useful, for example, in fungi that have been evolved to
produce new secondary metabolites, to allow entry of precursors required for synthesis of the
new secondary metabolites into a cell, or to allow efflux of the secondary metabolite from the
cell. Transporters can be evolved by introduction of a library of transporter variants into
fungal cells and allowing the cells to recombine by sexual or parasexual recombination. To
evolve a transporter with capacity to transport a precursor into the cells, cells are propagated
in the presence of precursor, and cells are then screened for production of metabolite. To
evolve a transporter with capacity to export a metabolite, cells are propagated under
conditions supporting production of the metabolite, and screened for export of metabolite to
culture medium.

A general method of fungal shuffling is shown in Fig. 7. Spores from a frozen
stock, a lyophilized stock, or fresh from an agar plate are used to inoculate suitable liquid
medium (1). Spores are germinated resulting in hyphal growth (2). Mycelia are harvested,
and washed by filtration and/or centrifugation. Optionally the sample is pretreated with DTT
to enhance protoplast formation (3). Protoplasting is performed in an osmotically stabilizing
medium (e.g., 1 M NaCl/20 mM MgSO4, pH 5.8) by the addition of cell wall-degrading
enzyme (e.g., Novozyme 234) (4). Cell wall degrading enzyme is removed by repeated
washing with osmotically stabilizing solution (5). Protoplasts can be separated from mycelia,
debris and spores by filtration through miracloth, and density centrifugation (6). Protoplasts
are harvested by centrifugation and resuspended to the appropriate concentration. This step
may lead to some protoplast fusion (7). Fusion can be stimulated by addition of PEG (e.g.,
PEG 3350), and/or repeated centrifugation and resuspension with or without PEG.
Electrofusion can also be performed (8). Fused protoplasts can optionally be enriched from
unfused protoplasts by sucrose gradient sedimentation (or other methods of screening
described above). Fused protoplasts can optionally be treated with ultraviolet irradiation to
stimulate recombination (9). Protoplasts are cultured on osmotically stabilized agar plates to
regenerate cell walls and form mycelia (10). The mycelia are used to generate spores (11),
which are used as the starting material in the next round of shuffling (12).
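
The recursive character of this cycle, in which the products of one round seed the next, can
be illustrated with a toy simulation. The following Python sketch is illustrative only and
does not model any laboratory step: genomes are represented as lists of 0/1 alleles, the
hypothetical function `fuse` stands in for protoplast fusion followed by recombination, and
the hypothetical function `fitness` stands in for whatever screen or selection is applied.
Population size, number of fusants per round and the scoring rule are all assumptions.

```python
import random


def fuse(parent_a, parent_b, rng):
    """Toy stand-in for protoplast fusion plus recombination:
    each locus of the fusant is drawn at random from one parent."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]


def fitness(genome):
    """Hypothetical screen: the number of beneficial alleles (1s)."""
    return sum(genome)


def shuffle_round(population, rng, keep):
    """One round: fuse random pairs, then keep the `keep` best fusants."""
    fusants = []
    for _ in range(len(population) * 2):
        a, b = rng.sample(population, 2)
        fusants.append(fuse(a, b, rng))
    fusants.sort(key=fitness, reverse=True)
    return fusants[:keep]


def recursive_shuffling(population, rounds, seed=0):
    """Repeat recombination and selection, feeding survivors forward."""
    rng = random.Random(seed)
    keep = len(population)
    for _ in range(rounds):
        population = shuffle_round(population, rng, keep)
    return population


if __name__ == "__main__":
    rng = random.Random(1)
    start = [[rng.randint(0, 1) for _ in range(12)] for _ in range(8)]
    final = recursive_shuffling(start, rounds=5)
    print("best starting score:", max(map(fitness, start)))
    print("best final score:", max(map(fitness, final)))
```

Because each fusant draws every locus from one of its two parents, beneficial alleles that
start out in different members of the population can accumulate in a single genome over
several rounds, which is the purpose of repeating the cycle.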

Selection for a desired property can be performed either on regenerated 
mycelia or spores derived therefrom. 

In an alternative method, protoplasts are formed by inhibition of one or more
enzymes required for cell wall synthesis (see Fig. 8). The inhibitor should be fungistatic rather
than fungicidal under the conditions of use. Examples of inhibitors include antifungal
compounds described by, e.g., Georgopapadakou & Walsh, Antimicrob. Ag. Chemother. 40,
279-291 (1996); Lyman & Walsh, Drugs 44, 9-35 (1992). Other examples include chitin
synthase inhibitors (polyoxin or nikkomycin compounds) and/or glucan synthase inhibitors
(e.g. echinocandins, papulocandins, pneumocandins). Inhibitors should be applied in
osmotically stabilized medium. Cells stripped of their cell walls can be fused or otherwise
employed as donors or hosts in genetic transformation/strain development programs. A
possible scheme utilizing this method reiteratively is outlined in Figure 8.

In a further variation, protoplasts are prepared using strains of fungi which are
genetically deficient or compromised in their ability to synthesize intact cell walls (see Fig. 9).
Such mutants are generally referred to as fragile, osmotic-remedial, or cell wall-less, and are
obtainable from strain depositories. Examples of such strains include Neurospora crassa os
mutants (Selitrennikoff, Antimicrob. Agents Chemother. 23, 757-765 (1983)). Some such
mutations are temperature-sensitive. Temperature-sensitive strains can be propagated at the
permissive temperature for purposes of selection and amplification and at a nonpermissive
temperature for purposes of protoplast formation and fusion. A temperature-sensitive
Neurospora crassa os strain has been described which propagates as protoplasts when grown
in osmotically stabilizing medium containing sorbose and polyoxin at nonpermissive
temperature but generates whole cells on transfer to medium containing sorbitol at a
permissive temperature. See US 4,873,196.

Other suitable strains can be produced by targeted mutagenesis of genes
involved in chitin synthesis, glucan synthesis and other cell wall-related processes. Examples
of such genes include CHT1, CHT2 and CAL1 (or CSD2) of Saccharomyces cerevisiae and
Candida spp. (Georgopapadakou & Walsh 1996); ETG1/FKS1/CND1/CWH53/PBR1 and
homologs in S. cerevisiae, Candida albicans, Cryptococcus neoformans, Aspergillus
fumigatus, ChvA/NdvA Agrobacterium and Rhizobium. Other examples are orlA, orlB, orlC,
orlD, tsE, and bimG of Aspergillus nidulans (Borgia, J. Bacteriol. 174, 377-389 (1992)).
Strains of A. nidulans containing OrlA1 or tsE1 mutations lyse at restrictive temperatures.
Lysis of these strains may be prevented by osmotic stabilization, and the mutations may be
complemented by the addition of N-acetylglucosamine (GlcNAc). BimG11 mutations are ts for
a type 1 protein phosphatase (germlings of strains carrying this mutation lack chitin, and
conidia swell and lyse). Other suitable genes are chsA, chsB, chsC, chsD and chsE of
Aspergillus fumigatus; chs1 and chs2 of Neurospora crassa; Phycomyces blakesleeanus MM
and chs1, 2 and 3 of S. cerevisiae. Chs1 is a non-essential repair enzyme; chs2 is involved in
septum formation and chs3 is involved in cell wall maturation and bud ring formation.

Other useful strains include S. cerevisiae CLY (cell lysis) mutants such as ts
strains (Paravicini et al., Mol. Cell Biol. 12, 4896-4905 (1992)), and the CLY 15 strain which
harbors a PKC 1 gene deletion. Other useful strains include strain VY 1160 containing a ts
mutation in srb (encoding actin) (Schade et al., Acta Histochem. Suppl. 41, 193-200 (1991)),
and a strain with an ses mutation which results in increased sensitivity to cell-wall digesting
enzymes isolated from snail gut (Metha & Gregory, Appl. Environ. Microbiol. 41, 992-999
(1981)). Useful strains of C. albicans include those with mutations in chs1, chs2, or chs3
(encoding chitin synthetases), such as osmotic remedial conditional lethal mutants described by
Payton & de Tiani, Curr. Genet. 17, 293-296 (1990); C. utilis mutants with increased
sensitivity to cell-wall digesting enzymes isolated from snail gut (Metha & Gregory, 1981,
supra); and N. crassa mutants os-1, os-2, os-3, os-4, os-5, and os-6. See Selitrennikoff,
Antimicrob. Agents Chemother. 23, 757-765 (1983). Such mutants grow and divide without a
cell wall at [...] but at 22°C produce a cell wall.

Targeted mutagenesis can be achieved by transforming cells with a
positive-negative selection vector containing homologous regions flanking a segment to be
targeted, a positive selection marker between the homologous regions and a negative selection
marker outside the homologous regions (see Capecchi, US 5,627,059). In a variation, the
negative selection marker can be an antisense transcript of the positive selection marker (see
US 5,527,674).

Other suitable cells can be selected by random mutagenesis or shuffling
procedures in combination with selection. For example, a first subpopulation of cells is
mutagenized, allowed to recover from mutagenesis, subjected to incomplete degradation of
cell walls and then contacted with protoplasts of a second subpopulation of cells. Hybrid
cells bearing markers from both subpopulations are identified (as described above) and used as
the starting materials in a subsequent round of shuffling. This selection scheme selects both
for cells with capacity for spontaneous protoplast formation and for cells with enhanced
recombinogenicity.

In a further variation, cells having capacity for spontaneous protoplast
formation can be crossed with cells having enhanced recombinogenicity evolved using other
methods of the invention. The hybrid cells are particularly suitable hosts for whole genome
shuffling.

Cells with mutations in enzymes involved in cell wall synthesis or maintenance
can undergo fusion simply as a result of propagating the cells in osmotic-protected culture due
to spontaneous protoplast formation. If the mutation is conditional, cells are shifted to a
nonpermissive condition. Protoplast formation and fusion can be accelerated by addition of
promoting agents, such as PEG or an electric field (see Philipova & Venkov, Yeast 6, 205-212
(1990); Tsoneva et al., FEMS Microbiol. Lett. 51, 61-65 (1989)).

5. Targeted Shuffling - Hot Spots

In one aspect, targeted homologous genes are cloned into specific regions of
the genome (e.g., by homologous recombination or other targeting procedures) which are
known to be recombination "hot spots" (i.e., regions showing elevated levels of recombination
compared to the average level of recombination observed across an entire genome), or known
to be proximal to such hot spots. The resulting recombinant strains are mated recursively.
During meiotic recombination, homologous recombinant genes recombine, thereby increasing
the diversity of the genes. After several cycles of recombination by recursive mating, the
resulting cells are screened.

6. Shuffling Methods in Yeast

Yeasts are subspecies of fungi that grow as single cells. Yeasts are used for the
production of fermented beverages and leavening, for production of ethanol as a fuel, low
molecular weight compounds, and for the heterologous production of proteins and enzymes
(see accompanying list of yeast strains and their uses). Commonly used strains of yeast
include Saccharomyces cerevisiae, Pichia sp., Candida sp. and Schizosaccharomyces pombe.

Several types of vectors are available for cloning in yeast including integrative
plasmid (YIp), yeast replicating plasmid (YRp, such as the 2μ circle based vectors), yeast
episomal plasmid (YEp), yeast centromeric plasmid (YCp), or yeast artificial chromosome
(YAC). Each vector can carry markers useful to select for the presence of the plasmid such as
LEU2, URA3, and HIS3, or the absence of the plasmid such as URA3 (a gene that is toxic to
cells grown in the presence of 5-fluoroorotic acid).

Many yeasts have a sexual cycle and asexual (vegetative) cycles. The sexual
cycle involves the recombination of the whole genome of the organism each time the cell
passes through meiosis. For example, when diploid cells of S. cerevisiae are exposed to
nitrogen and carbon limiting conditions, diploid cells undergo meiosis to form asci. Each
ascus holds four haploid spores, two of mating type "a" and two of mating type "α." Upon
return to rich medium, haploid spores of opposite mating type mate to form diploid cells once
again. Ascospores of opposite mating type can mate within the ascus, or if the ascus is
degraded, for example with zymolase, the haploid cells are liberated and can mate with spores
from other asci. This sexual cycle provides a format to shuffle endogenous genomes of yeast
and/or exogenous fragment libraries inserted into yeast vectors. This process results in
swapping or accumulation of hybrid genes, and in the shuffling of homologous sequences
shared by mating cells.
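
The 2:2 segregation of mating type within an ascus, and the requirement that spores of
opposite mating type mate, can be made concrete with a small model. The following Python
sketch is illustrative only: it treats every locus as unlinked, so that each segregates 2:2
independently of the others, ignores crossing over within genes, and takes the first locus of
each genome to be the mating-type locus. The function names and the marker names in the
example are assumptions of the sketch.

```python
import random


def meiosis(diploid, rng):
    """Return a tetrad (four haploid spores) from a diploid given as two
    homologs, each a list of alleles.

    Toy model: loci are unlinked, so each locus segregates 2:2
    independently of every other locus.
    """
    homolog_a, homolog_b = diploid
    spores = [[], [], [], []]
    for allele_a, allele_b in zip(homolog_a, homolog_b):
        # After replication there are two copies of each parental allele.
        copies = [allele_a, allele_a, allele_b, allele_b]
        rng.shuffle(copies)
        for spore, allele in zip(spores, copies):
            spore.append(allele)
    return spores


def mate(spore_x, spore_y):
    """Fuse two haploid spores of opposite mating type into a diploid.
    Locus 0 is assumed to be the mating-type locus."""
    if spore_x[0] == spore_y[0]:
        raise ValueError("spores of the same mating type do not mate")
    return (spore_x, spore_y)


if __name__ == "__main__":
    rng = random.Random(1)
    diploid = (["a", "LEU2", "ura3"], ["alpha", "leu2", "URA3"])
    for spore in meiosis(diploid, rng):
        print(spore)
```

Repeating `meiosis` and `mate` over a population gives a simple picture of how recursive
mating reassorts markers that began in different parental strains.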

Yeast strains having mutations in several known genes have properties useful
for shuffling. These properties include increasing the frequency of recombination and
increasing the frequency of spontaneous mutations within a cell. These properties can be the
result of mutation of a coding sequence or altered expression (usually overexpression) of a
wildtype coding sequence. The HO nuclease effects the transposition of HMLa/α and
HMRa/α to the MAT locus resulting in mating type switching. Mutants in the gene encoding
this enzyme do not switch their mating type and can be employed to force crossing between
strains of defined genotype, such as ones that harbor a library or have a desired phenotype, and
to prevent inbreeding of starter strains. PMS1, MLH1, MSH2, MSH6 are involved in
mismatch repair. Mutations in these genes all have a mutator phenotype (Chambers et al.,
Mol. Cell Biol. 16, 6110-6120 (1996)). Mutations in TOP3 DNA topoisomerase have a
6-fold enhancement of interchromosomal homologous recombination (Bailis et al., Molecular
and Cellular Biology 12, 4988-4993 (1992)). The RAD50-57 genes confer resistance to
radiation. Rad3 functions in excision of pyrimidine dimers. RAD52 functions in gene
conversion. RAD50, MRE11, XRS2 function in both homologous recombination and
illegitimate recombination. HOP1, RED1 function in early meiotic recombination
(Mao-Draayer, Genetics 144, 71-86). Mutations in either HOP1 or RED1 reduce double
stranded breaks at the HIS2 recombination hotspot. Strains deficient in these genes are useful
for maintaining stability in hyper recombinogenic constructs such as tandem expression
libraries carried on YACs. Mutations in HPR 1 are hyperrecombinogenic. HDF1 has DNA
end binding activity and is involved in double stranded break repair and V(D)J recombination.
Strains bearing this mutation are useful for transformation with random genomic fragments by
either protoplast fusion or electroporation. Kar-1 is a dominant mutation that prevents
karyogamy. Kar-1 mutants are useful for the directed transfer of single chromosomes from a
donor to a recipient strain. This technique has been widely used in the transfer of YACs
between strains, and is also useful in the transfer of evolved genes/chromosomes to other
organisms (Markie, YAC Protocols, (Humana Press, Totowa, NJ, 1996)). HOT1 is an S.
cerevisiae recombination hotspot within the promoter and enhancer region of the rDNA
repeat sequences. This locus induces mitotic recombination at adjacent sequences,
presumably due to its high level transcription. Genes and/or pathways inserted under the
transcriptional control of this region undergo increased mitotic recombination. The regions
surrounding the arg 4 and his 4 genes are also recombination hot spots, and genes cloned in
these regions have an increased probability of undergoing recombination during meiosis.
Homologous genes can be cloned in these regions and shuffled in vivo by recursively mating
the recombinant strains. CDC2 encodes polymerase δ and is necessary for mitotic gene
conversion. Overexpression of this gene can be used in a shuffler or mutator strain. A
temperature sensitive mutation in CDC4 halts the cell cycle at G1 at the restrictive
temperature and could be used to synchronize protoplasts for optimized fusion and subsequent
recombination.

As with filamentous fungi, the general goals of shuffling yeast include
improvement in yeast as a host organism for genetic manipulation, and as a production
apparatus for various compounds. One desired property in either case is to improve the
capacity of yeast to express and secrete a heterologous protein. The following example
describes the use of shuffling to evolve yeast to express and secrete increased amounts of
RNase A.

RNase A catalyzes the cleavage of the P-O5' bond of RNA specifically after
pyrimidine nucleotides. The enzyme is a basic 124 amino acid polypeptide that has 8 half
cystine residues, each required for catalysis. YEpWL-RNase A is a vector that effects the
expression and secretion of RNase A from the yeast S. cerevisiae, and yeast harboring this
vector secrete 1-2 mg of recombinant RNase A per liter of culture medium (del Cardayre et
al., Protein Engineering 8(3):261-273 (1995)). This overall yield is poor for a protein
heterologously expressed in yeast and can be improved at least 10-100 fold by shuffling. The
expression of RNase A is easily detected by several plate and microtitre plate assays (del
Cardayre & Raines, Biochemistry 33, 6031-6037 (1994)). Each of the described formats for
whole genome shuffling can be used to shuffle a strain of S. cerevisiae harboring
YEpWL.RNase A, and the resulting cells can be screened for the increased secretion of RNase
A into the medium. The new strains are cycled recursively through the shuffling format, until
sufficiently high levels of RNase A secretion are observed. The use of RNase A is particularly
useful since it not only requires proper folding and disulfide bond formation but also proper
glycosylation. Thus numerous components of the expression, folding, and secretion systems
can be optimized. The resulting strain is also evolved for improved secretion of other
heterologous proteins.

Another goal of shuffling yeast is to increase the tolerance of yeast to ethanol.
This is useful both for the commercial production of ethanol, and for the production of more
alcoholic beers and wines. The yeast strain to be shuffled acquires genetic material by
exchange or transformation with other strain(s) of yeast, which may or may not be known to
have superior resistance to ethanol. The strain to be evolved is shuffled and shufflants are
selected for capacity to survive exposure to ethanol. Increasing concentrations of ethanol can
be used in successive rounds of shuffling. The same principles can be used to shuffle baking
yeasts for improved osmotolerance.
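
The escalating selection described in this paragraph can be illustrated with a small toy
simulation. It is only a sketch under stated assumptions: a strain is modeled as a list of
per-locus contributions to ethanol tolerance, tolerance is taken to be their sum, and
recombination picks each locus from one of two parents at random. None of these modeling
choices, nor the names recombine, tolerance and evolve, come from the procedure itself.

```python
import random

def recombine(parent_a, parent_b, rng):
    """Hypothetical shuffling step: each locus is inherited from one parent at random."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

def tolerance(genome):
    """Assumed additive model: tolerance is the sum of the per-locus contributions."""
    return sum(genome)

def evolve(population, ethanol_levels, offspring_per_round, rng):
    """Shuffle, then select at each ethanol level in turn.

    Returns the surviving population and the highest level it passed
    (None if too few shufflants survived the first level).
    """
    passed = None
    for level in ethanol_levels:
        offspring = []
        for _ in range(offspring_per_round):
            parent_a, parent_b = rng.sample(population, 2)
            offspring.append(recombine(parent_a, parent_b, rng))
        survivors = [g for g in offspring if tolerance(g) >= level]
        if len(survivors) < 2:
            break  # not enough survivors to recombine further
        population, passed = survivors, level
    return population, passed

rng = random.Random(0)
start = [[rng.randint(0, 3) for _ in range(6)] for _ in range(20)]
final, passed = evolve(start, [6, 9, 12], 200, rng)
print("highest level passed:", passed, "survivors:", len(final))
```

Raising the threshold only after a round has produced survivors mirrors the use of
increasing ethanol concentrations in successive rounds; a real selection would of course
act on living cultures rather than on a numeric score.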

Another desired property in shuffling yeast is the capacity to grow under desired
nutritional conditions. For example, it is useful for yeast to grow on cheap carbon sources such
as methanol, starch, molasses, cellulose, cellobiose, or xylose, depending on availability. The
principles of shuffling and selection are similar to those discussed for filamentous fungi.

Another desired property is capacity to produce secondary metabolites
naturally produced by filamentous fungi or bacteria. Examples of such secondary metabolites
are cyclosporin A, taxol, and cephalosporins. The yeast to be evolved undergoes genetic
exchange or is transformed with DNA from organism(s) that produce the secondary
metabolite. For example, fungi producing taxol include Taxomyces andreanae and
Pestalotiopsis microspora (Stierle et al., Science 260, 214-216 (1993); Strobel et al., Microbiol.
142, 435-440 (1996)). DNA can also be obtained from trees that naturally produce taxol,
such as Taxus brevifolia. DNA encoding one enzyme in the taxol pathway, taxadiene
synthase, which it is believed catalyzes the committed step in taxol biosynthesis and may be
rate limiting in overall taxol production, has been cloned (Wildung & Croteau, J. Biol. Chem.
271, 9201-4 (1996)). The DNA is then shuffled, and shufflants are screened/selected for
production of the secondary metabolite. For example, taxol production can be monitored
using antibodies to taxol, by mass spectroscopy or UV spectrophotometry. Alternatively,
production of intermediates in taxol synthesis or enzymes in the taxol synthetic pathway can be
monitored (Concetti & Ripani, Biol. Chem. Hoppe Seyler 375, 419-23 (1994)). Other
examples of secondary metabolites are polyols, amino acids, polyketides, non-ribosomal
polypeptides, ergosterol, carotenoids, terpenoids, sterols, vitamin E, and the like.

Another desired property is to increase the flocculence of yeast to facilitate
separation in preparation of ethanol. Yeast can be shuffled by any of the procedures noted
above with selection for shuffled yeast forming the largest clumps.

7. Exemplary procedure for yeast protoplasting
Protoplast preparation in yeast is reviewed by Morgan, in Protoplasts
(Birkhauser Verlag, Basel, 1983). Fresh cells (~10^8) are washed with buffer, for example 0.1
M potassium phosphate, then resuspended in this same buffer containing a reducing agent,
such as 50 mM DTT, incubated for 1 h at 30°C with gentle agitation, and then washed again
with buffer to remove the reducing agent. These cells are then resuspended in buffer
containing a cell wall degrading enzyme, such as Novozyme 234 (1 mg/mL), and any of a
variety of osmotic stabilizers, such as sucrose, sorbitol, NaCl, KCl, MgSO4, MgCl2, or NH4Cl
at any of a variety of concentrations. These suspensions are then incubated at 30°C with gentle
shaking (~60 rpm) until protoplasts are released. To generate protoplasts that are more likely
to produce productive fusants, several strategies are possible.

Protoplast formation can be increased if the cell cycle of the protoplasts has
been synchronized to be halted at G1. In the case of S. cerevisiae this can be accomplished by
the addition of mating factors, either a or α (Curran & Carter, J. Gen. Microbiol. 129,
1589-1591 (1983)). These peptides act as adenylate cyclase inhibitors which, by decreasing
the cellular level of cAMP, arrest the cell cycle at G1. In addition, sex factors have been
shown to induce the weakening of the cell wall in preparation for the sexual fusion of a and α
cells (Crandall & Brock, Bacteriol. Rev. 32, 139-163 (1968); Osumi et al., Arch. Microbiol.
97, 27-38 (1974)). Thus in the preparation of protoplasts, cells can be treated with mating
factors or other known inhibitors of adenylate cyclase, such as leflunomide or the killer toxin
from K. lactis, to arrest them at G1 (Sugisaki et al., Nature 304, 464-466 (1983)). Then after
fusing of the protoplasts (step 2), cAMP can be added to the regeneration medium to induce
S-phase and DNA synthesis. Alternatively, yeast strains having a temperature sensitive
mutation in the CDC4 gene can be used, such that cells could be synchronized and arrested at
G1. After fusion, cells are returned to the permissive temperature so that DNA synthesis and
growth resume.

Once suitable protoplasts have been prepared, it is necessary to induce fusion
by physical or chemical means. An equal number of protoplasts of each cell type is mixed in
phosphate buffer (0.2 M, pH 5.8, 2x10^8 cells/mL) containing an osmotic stabilizer, for
example 0.8 M NaCl, and PEG 6000 (33% w/v) and then incubated at 30°C for 5 min while
fusion occurs. Polyols, or other compounds that bind water, can be employed. The fusants
are then washed and resuspended in the osmotically stabilized buffer lacking PEG, and
transferred to osmotically stabilized regeneration medium on/in which the cells can be selected
or screened for a desired property.
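
As a worked illustration of the fusion mixture quoted above (0.8 M NaCl and 33% w/v PEG
6000), the following sketch computes the masses needed for a chosen volume. The function
name fusion_buffer_masses and the 10 mL batch size are illustrative assumptions; the molar
mass of NaCl (58.44 g/mol) is a standard value, and "% w/v" is read as grams per 100 mL.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl

def fusion_buffer_masses(volume_ml, nacl_molar=0.8, peg_percent_wv=33.0):
    """Return (grams NaCl, grams PEG 6000) for `volume_ml` of fusion mixture."""
    nacl_g = nacl_molar * (volume_ml / 1000.0) * NACL_G_PER_MOL
    peg_g = peg_percent_wv / 100.0 * volume_ml  # % w/v means grams per 100 mL
    return nacl_g, peg_g

# Hypothetical 10 mL batch
nacl_g, peg_g = fusion_buffer_masses(10.0)
print(f"NaCl: {nacl_g:.3f} g, PEG 6000: {peg_g:.2f} g")
# prints "NaCl: 0.468 g, PEG 6000: 3.30 g"
```

Other osmotic stabilizers or concentrations can be substituted by changing the molar mass
constant and the default arguments.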

8. Shuffling Methods Using Artificial Chromosomes
Yeast artificial chromosomes (YACs) are yeast vectors into which very large
DNA fragments (e.g., 50-2000 kb) can be cloned (see, e.g., Monaco & Larin, Trends
Biotech. 12(7), 280-286 (1994); Ramsay, Mol. Biotechnol. 1(2), 181-201 (1994); Huxley,
Genet. Eng. 16, 65-91 (1994); Jakobovits, Curr. Biol. 4(8), 761-3 (1994); Lamb & Gearhart,
Curr. Opin. Genet. Dev. 5(3), 342-8 (1995); Montoliu et al., Reprod. Fertil. Dev. 6, 577-84
(1994)). These vectors have telomeres (Tel), a centromere (Cen), an autonomously
replicating sequence (ARS), and can have genes for positive (e.g., TRP1) and negative (e.g.,
URA3) selection. YACs are maintained, replicated, and segregated like other yeast
chromosomes through both meiosis and mitosis, thereby providing a means to expose cloned
DNA to true meiotic recombination.

YACs provide a vehicle for the shuffling of libraries of large DNA fragments in
vivo. The substrates for shuffling are typically large fragments from 20 kb to 2 Mb. The
fragments can be random fragments or can be fragments known to encode a desirable
property. For example, a fragment might include an operon of genes involved in production of
antibiotics. Libraries can also include whole genomes or chromosomes. Viral genomes and
some bacterial genomes can be cloned intact into a single YAC. In some libraries, fragments
are obtained from a single organism. Other libraries include fragment variants, as where some
libraries are obtained from different individuals or species. Fragment variants can also be
generated by induced mutation. Typically, genes within fragments are expressed from
naturally associated regulatory sequences within yeast. However, alternatively, individual
genes can be linked to yeast regulatory elements to form an expression cassette, and a
concatemer of such cassettes, each containing a different gene, can be inserted into a YAC.

In some instances, fragments are incorporated into the yeast genome, and
shuffling is used to evolve improved yeast strains. In other instances, fragments remain as
components of YACs throughout the shuffling process, and after acquisition of a desired
property, the YACs are transferred to a desired recipient cell.

9. Methods of Evolving Yeast Strains
Fragments are cloned into a YAC vector, and the resulting YAC library is
transformed into competent yeast cells. Transformants containing a YAC are identified by
selecting for a positive selection marker present on the YAC. The cells are allowed to recover
and are then pooled. Thereafter, the cells are induced to sporulate by transferring the cells
from rich medium to nitrogen and carbon limiting medium. In the course of sporulation, cells
undergo meiosis. Spores are then induced to mate by return to rich media. Optionally, asci
are lysed to liberate spores, so that the spores can mate with other spores originating from
other asci. Mating results in recombination between YACs bearing different inserts, and
between YACs and natural yeast chromosomes. The latter can be promoted by irradiating
spores with ultraviolet light. Recombination can give rise to new phenotypes either as a result
of genes expressed by fragments on the YACs or as a result of recombination with host genes,
or both.

After induction of recombination between YACs and natural yeast
chromosomes, YACs are often eliminated by selecting against a negative selection marker on
the YACs. For example, YACs containing the marker URA3 can be selected against by
propagation on media containing 5-fluoro-orotic acid. Any exogenous or altered genetic
material that remains is contained within natural yeast chromosomes. Optionally, further
rounds of recombination between natural yeast chromosomes can be performed after
elimination of YACs. Optionally, the same or a different library of YACs can be transformed
into the cells, and the above steps repeated. By recursively repeating this process, the
diversity of the population is increased prior to screening.

After elimination of YACs, yeast are then screened or selected for a desired
property. The property can be a new property conferred by transferred fragments, such as
production of an antibiotic. The property can also be an improved property of the yeast such
as improved capacity to express or secrete an exogenous protein, improved
recombinogenicity, improved stability to temperature or solvents, or other property required
of commercial or research strains of yeast.

Yeast strains surviving selection/screening are then subject to a further round
of recombination. Recombination can be exclusively between the chromosomes of yeast
surviving selection/screening. Alternatively, a library of fragments can be introduced into the
yeast cells and recombined with endogenous yeast chromosomes as before. This library of
fragments can be the same as or different from the library used in the previous round of
transformation. For example, the YACs could contain a library of genomic DNA isolated
from a pool of the improved strains obtained in the earlier steps. YACs are eliminated as
before, followed by additional rounds of recombination and/or transformation with further
YAC libraries. Recombination is followed by another round of selection/screening, as above.
Further rounds of recombination/screening can be performed as needed until a yeast strain has
evolved to acquire the desired property.
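
The recursive cycle of this section (transform with a YAC library, recombine with natural
chromosomes, eliminate YACs on 5-fluoro-orotic acid, screen, repeat) can be sketched as a
toy state model. Everything quantitative here is an assumption made for illustration: the
Cell class, the integration probability `rate`, the YAC loss probability `loss_rate`, the
regrowth step, and the screening criterion are hypothetical and are not specified above.

```python
import random
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Cell:
    integrated: set = field(default_factory=set)  # fragments now in natural chromosomes
    yac: Optional[str] = None                     # fragment on a URA3-marked YAC, if any

def transform(cells, library, rng):
    """Give every cell one YAC drawn at random from the library."""
    for cell in cells:
        cell.yac = rng.choice(library)

def recombine(cells, rate, rng):
    """With probability `rate`, the YAC insert is copied into a natural chromosome."""
    for cell in cells:
        if cell.yac is not None and rng.random() < rate:
            cell.integrated.add(cell.yac)

def cure_on_5foa(cells, loss_rate, rng):
    """Cells lose the YAC with probability `loss_rate`; cells keeping URA3 die on 5-FOA."""
    survivors = []
    for cell in cells:
        if rng.random() < loss_rate:
            cell.yac = None
            survivors.append(cell)
    return survivors

def screen(cells, min_fragments):
    """Stand-in screen: keep cells carrying at least `min_fragments` integrated fragments."""
    return [c for c in cells if len(c.integrated) >= min_fragments]

def evolve(n_cells, library, rounds, rng, rate=0.5, loss_rate=0.5):
    cells = [Cell() for _ in range(n_cells)]
    for _ in range(rounds):
        transform(cells, library, rng)
        recombine(cells, rate, rng)
        cells = screen(cure_on_5foa(cells, loss_rate, rng), min_fragments=1)
        if not cells:
            return []
        # Regrow the survivors to the starting population size before the next round
        cells = [Cell(set(rng.choice(cells).integrated)) for _ in range(n_cells)]
    return cells

rng = random.Random(0)
library = ["frag1", "frag2", "frag3"]
final = evolve(200, library, 3, rng)
print("cells carrying 2+ fragments:", sum(len(c.integrated) >= 2 for c in final))
```

In this sketch, cells that pass several rounds tend to accumulate more than one integrated
fragment, which is the qualitative point of repeating the cycle; the numbers themselves
carry no experimental meaning.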

An exemplary scheme for evolving yeast by introduction of a YAC library is
shown in Fig. 10. The first part of the figure shows yeast containing an endogenous diploid
genome and a YAC library of fragments representing variants of a sequence. The library is
transformed into the cells to yield 100-1000 colonies per µg DNA. Most transformed yeast
cells now harbor a single YAC as well as endogenous chromosomes. Meiosis is induced by
growth on nitrogen and carbon limiting medium. In the course of meiosis the YACs
recombine with other chromosomes in the same cell. Haploid spores resulting from meiosis
mate and regenerate diploid forms. The diploid forms now harbor recombinant
chromosomes, parts of which come from endogenous chromosomes and parts from YACs.
Optionally, the YACs can now be cured from the cells by selecting against a negative selection
marker present on the YACs. Irrespective of whether YACs are selected against, cells are then
screened or selected for a desired property. Cells surviving selection/screening are
transformed with another YAC library to start another shuffling cycle.

10. Method of Evolving YACs for Transfer to Recipient Strain 
These methods are based in part on the fact that multiple YACs can be
harbored in the same yeast cell, and YAC-YAC recombination is known to occur (Green &
Olson, Science 250, 94-98 (1990)). Inter-YAC recombination provides a format in which
families of homologous genes harbored on fragments of >20 kb can be shuffled in vivo.
The starting population of DNA fragments shows sequence similarity with each other but differs
as a result of, for example, induced, allelic or species diversity. Often DNA fragments are
known or suspected to encode multiple genes that function in a common pathway.

The fragments are cloned into a YAC and transformed into yeast, typically with
positive selection for transformants. The transformants are induced to sporulate, as a result of
which chromosomes undergo meiosis. The cells are then mated. Most of the resulting diploid
cells now carry two YACs each having a different insert. These are again induced to sporulate
and mated. The resulting cells harbor YACs of recombined sequence. The cells can then be
screened or selected for a desired property. Typically, such selection occurs in the yeast strain
used for shuffling. However, if fragments being shuffled are not expressed in yeast, YACs can
be isolated and transferred to an appropriate cell type in which they are expressed for
screening. Examples of such properties include the synthesis or degradation of a desired
compound, increased secretion of a desired gene product, or other detectable phenotype.

Preferably, the YAC library is transformed into haploid a and haploid α cells.
These cells are then induced to mate with each other, i.e., they are pooled and induced to mate
by growth on rich medium. The diploid cells, each carrying two YACs, are then transferred to
sporulation medium. During sporulation, the cells undergo meiosis, and homologous
chromosomes recombine. In this case, the genes harbored in the YACs will recombine,
diversifying their sequences. The resulting haploid ascospores are then liberated from the asci
by enzymatic degradation of the ascus wall or other available means, and the pooled liberated
haploid ascospores are induced to mate by transfer to rich medium. This process is repeated
for several cycles to increase the diversity of the DNA cloned into the YACs. The resulting
population of yeast cells, preferably in the haploid state, is either screened for improved
properties, or the diversified DNA is delivered to another host cell or organism for screening.

Cells surviving selection/screening are subjected to successive cycles of
pooling, sporulation, mating and selection/screening until the desired phenotype has been
observed. Recombination can be achieved simply by transferring cells from rich medium to
carbon and nitrogen limited medium to induce sporulation, and then returning the spores to
rich media to induce mating. Asci can be lysed to stimulate mating of spores originating from
different asci.

After YACs have been evolved to encode a desired property they can be
transferred to other cell types. Transfer can be by protoplast fusion, or retransformation with
isolated DNA. For example, transfer of YACs from yeast to mammalian cells is discussed by
Monaco & Larin, Trends in Biotechnology 12, 280-286 (1994); Montoliu et al., Reprod.
Fertil. Dev. 6, 577-84 (1994); Lamb et al., Curr. Opin. Genet. Dev. 5, 342-8 (1995).

An exemplary scheme for shuffling a YAC fragment library in yeast is shown in
Fig. 11. A library of YAC fragments representing genetic variants is transformed into yeast
that have diploid endogenous chromosomes. The transformed yeast continue to have diploid
endogenous chromosomes, plus a single YAC. The yeast are induced to undergo meiosis and
sporulate. The spores contain haploid genomes and are selected for those which contain a
YAC, using the YAC selective marker. The spores are induced to mate, generating diploid
cells. The diploid cells now contain two YACs bearing different inserts as well as diploid
endogenous chromosomes. The cells are again induced to undergo meiosis and sporulate.
During meiosis, recombination occurs between the YAC inserts, and recombinant YACs are
segregated to ascoytes. Some ascoytes thus contain haploid endogenous chromosomes plus a
YAC chromosome with a recombinant insert. The ascoytes mature to spores, which can mate
again, generating diploid cells. Some diploid cells now possess a diploid complement of
endogenous chromosomes plus two recombinant YACs. These cells can then be taken
through further cycles of meiosis, sporulation and mating. In each cycle, further
recombination occurs between YAC inserts and further recombinant forms of inserts are
generated. After one or several cycles of recombination have occurred, cells can be tested for
acquisition of a desired property. Further cycles of recombination, followed by selection, can
then be performed in similar fashion.

11. In vivo Shuffling of Genes by the Recursive Mating of Yeast Cells
Harboring Homologous Genes in Identical Loci.

A goal of DNA shuffling is to mimic and expand the combinatorial capabilities
of sexual recombination. In vitro DNA shuffling succeeds in this process. However, by
changing the mechanism of recombination and altering the conditions under which
recombination occurs naturally, in vitro recombination methods may jeopardize intrinsic
information in a DNA sequence that renders it "evolvable."

Shuffling in vivo by employing the natural crossing over mechanisms that occur
during meiosis may access inherent natural sequence information and provide a means of
creating higher quality shuffled libraries. Described here is a method for the in vivo shuffling
of DNA that utilizes the natural mechanisms of meiotic recombination and provides an
alternative method for DNA shuffling.

The basic strategy is to clone genes to be shuffled into identical loci within the
haploid genome of yeast. The haploid cells are then recursively induced to mate and to
sporulate. The process subjects the cloned genes to recursive recombination during recursive
cycles of meiosis. The resulting shuffled genes are then screened in situ or isolated and
screened under different conditions.

For example, if one wished to shuffle a family of five lipase genes, the
following provides a means of doing so in vivo.

The open reading frame of each lipase is amplified by PCR such that each
ORF is flanked by identical 3' and 5' sequences. The 5' flanking sequence is identical to a
region within the 5' coding sequence of the S. cerevisiae ura3 gene and the 3' flanking
sequence is identical to a region within the 3' end of the ura3 gene. The flanking sequences are
chosen such that homologous recombination of the PCR product with the ura3 gene results in
the incorporation of the lipase gene and the disruption of the ura3 ORF. Both S. cerevisiae a
and α haploid cells are then transformed with each of the PCR amplified lipase ORFs, and
cells having incorporated a lipase gene into the ura3 locus are selected by growth on 5-fluoro-
orotic acid (5-FOA is lethal to cells expressing functional URA3). The result is 10 cell types,
two different mating types each harboring one of the five lipase genes in the disrupted ura3
locus. These cells are then pooled and grown under conditions where mating between the a
and α cells is favored, e.g., in rich medium.

Mating results in a combinatorial mixture of diploid cells having all 32 possible
combinations of lipase genes in the two ura3 loci. The cells are then induced to sporulate by
growth under carbon and nitrogen limited conditions. During sporulation the diploid cells
undergo meiosis to form four (two a and two α) haploid ascospores housed in an ascus.
During meiosis II of the sporulation process sister chromatids align and cross over. The lipase
genes cloned into the ura3 loci will also align and recombine. Thus the resulting haploid
ascospores will represent a library of cells each harboring a different possible chimeric lipase
gene, each a unique result of the meiotic recombination of the two lipase genes in the original
diploid cell. The walls of asci are degraded by treatment with zymolyase to liberate and allow
the mixing of the individual ascospores. This mixture is then grown under conditions that
promote the mating of the a and α haploid cells. It is important to liberate the individual
ascospores, since mating will otherwise occur between the ascospores within an ascus.
Mixing of the haploid cells allows recombination between more than two lipase genes,
enabling "poolwise recombination." Mating brings together new combinations of chimeric
genes that can then undergo recombination upon sporulation. The cells are recursively cycled
through sporulation, ascospore mixing, and mating until sufficient diversity has been generated
by the recursive pairwise recombination of the five lipase genes. The individual chimeric lipase
genes either can be screened directly in the haploid yeast cells or transferred to an appropriate
expression host.
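
The recursive mating and sporulation cycle for the five lipase genes can be pictured with
a short simulation. This is a sketch under simplifying assumptions that are not part of the
method as described: each gene is a 12-character string labeled by its parent (A to E),
every meiosis has exactly one crossover at a random position, every tetrad yields two a
and two alpha spores, and all spores mate. The function names are hypothetical.

```python
import random

def meiosis(gene_1, gene_2, rng):
    """One crossover between two aligned genes; returns four (mating type, gene) spores."""
    point = rng.randrange(1, len(gene_1))
    rec_1 = gene_1[:point] + gene_2[point:]
    rec_2 = gene_2[:point] + gene_1[point:]
    genes = [gene_1, gene_2, rec_1, rec_2]
    mating_types = ["a", "a", "alpha", "alpha"]
    rng.shuffle(mating_types)
    return list(zip(mating_types, genes))

def cycle(haploids, rng):
    """Mate a cells with alpha cells at random, then sporulate every diploid."""
    a_cells = [g for m, g in haploids if m == "a"]
    alpha_cells = [g for m, g in haploids if m == "alpha"]
    rng.shuffle(a_cells)
    rng.shuffle(alpha_cells)
    spores = []
    for gene_1, gene_2 in zip(a_cells, alpha_cells):
        spores.extend(meiosis(gene_1, gene_2, rng))
    return spores

def parents_in(gene):
    """Number of distinct parental genes contributing to a (possibly chimeric) gene."""
    return len(set(gene))

parents = "ABCDE"
# The 10 starting cell types: two mating types times five lipase genes
start = [(m, p * 12) for m in ("a", "alpha") for p in parents]

rng = random.Random(0)
gen1 = cycle(start, rng)
gen3 = cycle(cycle(gen1, rng), rng)
print("most parents in one gene after 1 cycle:", max(parents_in(g) for _, g in gen1))
print("most parents in one gene after 3 cycles:", max(parents_in(g) for _, g in gen3))
```

After a single cycle no gene can derive from more than two parents, because each diploid
holds only two genes; only the repeated mixing and mating of liberated ascospores lets
segments from three or more parents meet in one chimera.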

The process is described above for lipases and yeast; however, any sexual
organisms into which genes can be directed can be employed, and any genes, of course, could
be substituted for lipases. This process is analogous to the method of shuffling whole
genomes by recursive pairwise mating. The diversity, however, in the whole genome case is
distributed throughout the host genome rather than localized to specific loci.

12. Use of YACs to Clone Unlinked Genes 

Shuffling of YACs is particularly amenable to transfer of unlinked but
functionally related genes from one species to another, particularly where such genes have not
been identified. Such is the case for several commercially important natural products, such as
taxol. Transfer of the genes in the metabolic pathway to a different organism is often desirable
because organisms naturally producing such compounds are not well suited for mass culturing.

Clusters of such genes can be isolated by cloning a total genomic library of
DNA from an organism producing a useful compound into a YAC library. The YAC library
is then transformed into yeast. The yeast is sporulated and mated such that recombination
occurs between YACs and/or between YACs and natural yeast chromosomes.
Selection/screening is then performed for expression of the desired collection of genes. If the
genes encode a biosynthetic pathway, expression can be detected from the appearance of
product of the pathway. Production of individual enzymes in the pathway, or intermediates of
the final expression product, or capacity of cells to metabolize such intermediates indicates
partial acquisition of the synthetic pathway. The original library or a different library can be
introduced into cells surviving selection/screening, and further rounds of recombination and
selection/screening can be performed until the end product of the desired metabolic pathway is
produced.

13. YAC-YAC Shuffling

If a phenotype of interest can be isolated to a single stretch of genomic DNA
less than 2 megabases in length, it can be cloned into a YAC and replicated in S. cerevisiae.
The cloning of similar stretches of DNA from related hosts into an identical YAC results in a
population of yeast cells each harboring a YAC having a homologous insert effecting a desired
phenotype. The recursive breeding of these yeast cells allows the homologous regions of
these YACs to recombine during meiosis, allowing genes, pathways, and clusters to recombine
during each cycle of meiosis. After several cycles of mating and segregation, the YAC inserts
are well shuffled. The now very diverse yeast library could then be screened for phenotypic
improvements resulting from the shuffling of the YAC inserts.

14. YAC-Chromosome Shuffling
"Mitotic" recombination occurs during cell division and results from the
recombination of genes during replication. This type of recombination is not limited to that
between sister chromatids and can be enhanced by agents that induce recombination
machinery, such as nicking chemicals and ultraviolet irradiation. Since it is often difficult to
directly mate across a species barrier, it is possible to induce the recombination of homologous
genes originating from different species by providing the target genes to a desired host
organism as a YAC library. The genes harbored in this library are then induced to recombine
with homologous genes on the host chromosome by enhanced mitotic recombination. This
process is carried out recursively to generate a library of diverse organisms, which is then screened
for those having the desired phenotypic improvements. The improved subpopulation is then
mated recursively as above to identify new strains having accumulated multiple useful genetic
alterations.

15. Accumulation of Multiple YACs Harboring Useful Genes
The accumulation of multiple unlinked genes that are required for the
acquisition or improvement of a given phenotype can be accomplished by the shuffling of
YAC libraries. Genomic DNA from organisms having desired phenotypes, such as ethanol
tolerance, thermotolerance, and the ability to ferment pentose sugars, is pooled, fragmented
and cloned into several different YAC vectors, each having a different selective marker (his,
ura, ade, etc.). S. cerevisiae are transformed with these libraries, and selected for their
presence (using selective media, i.e., uracil dropout media for the YAC containing the Ura3
selective marker) and then screened for having acquired or improved a desired phenotype.
Surviving cells are pooled, mated recursively, and selected for the accumulation of multiple
YACs (by propagation in medium with multiple nutritional dropouts). Cells that acquire
multiple YACs harboring useful genomic inserts are identified by further screening. Optimized
strains can be used directly; however, due to the burden a YAC may pose to a cell, the
relevant YAC inserts can be minimized, subcloned, and recombined into the host chromosome,
to generate a more stable production strain.
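
The logic of selecting for multiple YACs on medium with multiple nutritional dropouts can
be written as a simple set test. The sketch below assumes a host strain that is auxotrophic
for histidine, uracil and adenine, and a hypothetical mapping from each YAC marker (his,
ura, ade) to the nutrient it lets the cell make; the names MARKER_SUPPLIES, grows and
select are illustrative only.

```python
# Hypothetical map from YAC selective marker to the nutrient it allows the cell to synthesize.
# Assumption: the host cannot make any of these nutrients without the corresponding YAC.
MARKER_SUPPLIES = {"his": "histidine", "ura": "uracil", "ade": "adenine"}

def grows(yac_markers, dropouts):
    """A cell grows only if its YACs complement every nutrient omitted from the medium."""
    supplied = {MARKER_SUPPLIES[m] for m in yac_markers}
    return set(dropouts) <= supplied

def select(cells, dropouts):
    """Keep the cells (each a set of YAC markers) that grow on the given dropout medium."""
    return [cell for cell in cells if grows(cell, dropouts)]

cells = [{"ura"}, {"his"}, {"ura", "his"}, {"ura", "his", "ade"}, set()]
single = select(cells, ["uracil"])
double = select(cells, ["uracil", "histidine"])
print(len(single), "cells grow without uracil;", len(double), "grow without uracil and histidine")
```

Adding a nutrient to the dropout list can only shrink the surviving set, which is why
propagation on multiply deficient medium enriches for cells that have accumulated several
differently marked YACs.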

16. Choice of Host SSF Organism

One example use for the present invention is to create an improved yeast for
the production of ethanol from lignocellulosic biomass. Specifically, a yeast strain with
improved ethanol tolerance and thermostability/thermotolerance is desirable. Parent yeast
strains known for good behavior in a Simultaneous Saccharification and Fermentation (SSF)
process are identified. These strains are combined with others known to possess ethanol
tolerance and/or thermostability.

S. cerevisiae is highly amenable to development for optimized SSF processes.
It inherently possesses several traits for this use, including the ability to import and ferment a
variety of sugars such as sucrose, glucose, galactose, maltose and maltotriose. Also, yeast has
the capability to flocculate, enabling recovery of the yeast biomass at the end of a fermentation
cycle, and allowing its re-use in subsequent bioprocesses. This is an important property in that
it optimizes the use of nutrients in the growth medium. S. cerevisiae is also highly amenable
to laboratory manipulation, has highly characterized genetics and possesses a sexual
reproductive cycle. S. cerevisiae may be grown under either aerobic or anaerobic conditions,
in contrast to some other potential SSF organisms that are strict anaerobes (e.g., Clostridium
spp.), making them very difficult to handle in the laboratory. S. cerevisiae is also "generally
regarded as safe" ("GRAS"), and, due to its widespread use for the production of important
comestibles for the general public (e.g., beer, wine, bread, etc.), is generally familiar and well
known. S. cerevisiae is commonly used in fermentative processes, and the familiarity in its
handling by fermentation experts eases the introduction of novel improved yeast strains into
the industrial setting.

S. cerevisiae strains that previously have been identified as particularly good
SSF organisms, for example, S. cerevisiae D5A (ATCC 200062) (South CR and Lynd LR.
(1994) Appl. Biochem. Biotechnol. 45/46: 467-481; Ranatunga TD et al. (1997) Biotechnol.
Lett. 19: 1125-1127), can be used as starting materials. In addition, other industrially used S.
cerevisiae strains are optionally used as host strains, particularly those showing desirable
fermentative characteristics, such as S. cerevisiae Y567 (ATCC 24858) (Sitton OC et al.
(1979) Process Biochem. 14(9): 7-10; Sitton OC et al. (1981) Adv. Biotechnol. 2: 231-237;
McMurrough I et al. (1971) Folia Microbiol. 16: 346-349) and S. cerevisiae ACA 174 (ATCC
60868) (Benitez T et al. (1983) Appl. Environ. Microbiol. 45: 1429-1436; Chem. Eng. J. 50:
B17-B22, 1992), which have been shown to have desirable traits for large-scale fermentation.

17. Choice of Ethanol Tolerant Strains
Many strains of S. cerevisiae have been isolated from high-ethanol
environments, and have survived in the ethanol-rich environment by adaptive evolution. For
example, strains from Sherry wine aging ("Flor" strains) have evolved highly functional
mitochondria to enable their survival in a high-ethanol environment. It has been shown that
transfer of these wine yeast mitochondria to other strains increases the recipient's resistance to
high ethanol concentration, as well as thermotolerance (Jimenez, J. and Benitez, T. (1988)
Curr. Genet. 13: 461-469). There are several flor strains deposited in the ATCC, for example
S. cerevisiae MY91 (ATCC 201301), MY138 (ATCC 201302), C5 (ATCC 201298), ET7
(ATCC 201299), LA6 (ATCC 201300), 0SB21 (ATCC 201303), F23 (S. globosus ATCC
90920). Also, several flor strains of S. uvarum and Torulaspora pretoriensis have been
deposited. Other ethanol-tolerant wine strains include S. cerevisiae ACA 174 (ATCC 60868),
15% ethanol, and S. cerevisiae A54 (ATCC 90921), isolated from wine containing 18% (v/v)
ethanol, and NRCC 202036 (ATCC 46534), also a wine yeast. Other S. cerevisiae
ethanologens that additionally exhibit enhanced ethanol tolerance include ATCC 248S8,
ATCC 24858, G 3706 (ATCC 42594), NRRL Y-265 (ATCC 60593), and ATCC 24845 -
ATCC 24860. A strain of S. pastorianus (S. carlsbergensis ATCC 2345) has high ethanol
tolerance (13% v/v). S. cerevisiae Sa28 (ATCC 26603), from a Jamaican cane juice sample,
produces high levels of alcohol from molasses, is sugar tolerant, and produces ethanol from
wood acid hydrolyzate.

Several of the listed strains, as well as additional strains, can be used as starting
materials for breeding ethanol tolerance.

18. Choice of Temperature Tolerant Strains 
A few temperature tolerant strains have been reported, including the highly
flocculent strain S. pastorianus SA 23 (S. carlsbergensis ATCC 26602), which produces
ethanol at elevated temperatures, and S. cerevisiae Kyokai 7 (S. sake, ATCC 26422), a sake
yeast tolerant to brief heat and oxidative stress. Ballesteros et al. ((1991) Appl. Biochem.
Biotechnol. 28/29: 307-315) examined 27 strains of yeast for their ability to grow and ferment
glucose in the 32-45°C temperature range, including Saccharomyces, Kluyveromyces and
Candida spp. Of these, the best thermotolerant clones were Kluyveromyces marxianus LG
and Kluyveromyces fragilis 2671 (Ballesteros et al. (1993) Appl. Biochem. Biotechnol. 39/40:
201-211). S. cerevisiae-pretoriensis FDHI was somewhat thermotolerant, but was poor
in ethanol tolerance. Recursive recombination of this strain with others that display ethanol
tolerance can be used to acquire the thermotolerant characteristics of the strain in progeny
which also display ethanol tolerance.

Candida acidothermophilum (Issatchenkia orientalis, ATCC 20381) is a good
SSF strain that also exhibits improved performance in ethanol production from lignocellulosic
biomass at higher SSF temperatures than S. cerevisiae D5A (Kadam, KL, Schmidt, SL (1997)
Appl. Microbiol. Biotechnol. 48: 709-713). This strain can also be a genetic contributor to an
improved SSF strain.

19. Shuffling of Strains

In those instances where strains are highly related, a recursive mating strategy
may be pursued. For example, a population of haploid S. cerevisiae (a and alpha) is
mutagenized and screened for improved EtOH or thermal tolerance. The improved haploid
subpopulation is mixed together, mated as a pool, and induced to sporulate. The resulting
haploid spores are freed by degrading the ascus wall and mixed. The freed spores are then
induced to mate and sporulate recursively. This process is repeated a sufficient number of
times to generate all possible mutant combinations. The whole genome shuffled population
(haploid) is then screened for further EtOH or thermal tolerance.
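
The pooled mate-and-sporulate cycle can be illustrated with a small simulation. The following Python sketch is not part of the disclosed method: the function names, the 0/1 genotype encoding, the pool size, and the assumption that loci are unlinked and inherited from either parent with equal probability are all illustrative assumptions.

```python
import random


def single_mutants(n_loci, copies):
    """Build a starting pool in which every haploid carries exactly one mutation."""
    pool = []
    for locus in range(n_loci):
        genotype = tuple(1 if i == locus else 0 for i in range(n_loci))
        pool.extend([genotype] * copies)
    return pool


def recursive_mating(population, rounds, rng):
    """Return the pool after `rounds` of pooled random mating and sporulation.

    Each genotype is a tuple of 0/1 flags, one per locus (1 = beneficial
    mutation). In every round each progeny is made by drawing two parents at
    random from the pool and inheriting each locus from either parent with
    equal probability (hypothetical model: unlinked loci, constant pool size).
    """
    pool = list(population)
    for _round in range(rounds):
        progeny = []
        for _child in range(len(pool)):
            mother, father = rng.sample(pool, 2)
            progeny.append(tuple(rng.choice(pair) for pair in zip(mother, father)))
        pool = progeny
    return pool


if __name__ == "__main__":
    rng = random.Random(0)
    start = single_mutants(n_loci=4, copies=50)
    shuffled = recursive_mating(start, rounds=5, rng=rng)
    print("distinct genotypes:", len(set(shuffled)))
    print("most mutations in one haploid:", max(sum(g) for g in shuffled))
```

Under these assumptions, a pool that starts as single mutants yields haploids carrying several mutations after a few rounds, and it is this recombined pool that would then be screened.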

When strains are not sufficiently related for recursive mating, formats based on 
protoplast fusion may be employed. Recursive and poolwise protoplast fusion can be 
performed to generate chimeric populations of diverse parental strains. The resultant pool of 
progeny is selected and screened to identify improved ethanol and thermal tolerant strains.

Alternatively, a YAC-based Whole Genome Shuffling format can be used. In
this format, YACs are used to shuttle large chromosomal fragments between strains. As
detailed earlier, recombination occurs between YACs or between YACs and the host
chromosomes. Genomic DNA from organisms having desired phenotypes is pooled,
fragmented and cloned into several different YAC vectors, each having a different selective
marker (his, ura, ade, etc.). S. cerevisiae are transformed with these libraries, selected for
their presence (using selective media, e.g. uracil dropout media for the YAC containing the
Ura3 selective marker) and then screened for having acquired or improved a desired
phenotype. Surviving cells are pooled, mated recursively (as above), and selected for the
accumulation of multiple YACs (by propagation in medium with multiple nutritional
dropouts). Cells that acquire multiple YACs harboring useful genomic inserts are identified by
further screening (see below).

20. Selection for Improved Strains 

Having produced large libraries of novel strains by mutagenesis and
recombination, a first task is to isolate those strains that possess improvements in the desired
phenotypes. Identification of the organism libraries is facilitated where the desired key traits
are selectable phenotypes. For example, ethanol has different effects on the growth rate of a
yeast population, viability, and fermentation rate. Inhibition of cell growth and viability
increases with ethanol concentration, but high fermentative capacity is only inhibited at higher
ethanol concentrations. Hence, selection of growing cells in ethanol is a viable approach to
isolate ethanol-tolerant strains. Subsequently, the selected strains may be analyzed for their
fermentative capacity to produce ethanol. Provided that growth and media conditions are the
same for all strains (parents and progeny), a hierarchy of ethanol tolerance may be
constructed.

Simple selection schemes for identification of thermal tolerant and ethanol
tolerant strains are available and, in this case, are based on those previously designed to
identify potentially useful SSF strains. Selection of ethanol tolerance is performed by exposing
the population to ethanol, then plating the population and looking for growth. Colonies
capable of growing after exposure to ethanol can be re-exposed to a higher concentration of
ethanol and the cycle repeated until the most tolerant strains are selected. In order to discern
strains possessing heritable ethanol tolerance from those with temporarily acquired adaptations,
these cycles may be punctuated with cycles of growth in the absence of selection (e.g. no
ethanol).

Alternatively, the mixed population can be grown directly at increasing
concentrations of ethanol, and the most tolerant strains enriched (Aguilera and Benitez, 1986,
Arch Microbiol 4:337-44). For example, this enrichment could be carried out in a chemostat or
turbidostat. Similar selections can be developed for thermal tolerance, in which strains are
identified by their ability to grow after a heat treatment, or directly for growth at elevated
temperatures (Ballesteros et al., 1991, Applied Biochem and Biotech, 28:307-315). The best
strains identified by these selections will be assayed more thoroughly in subsequent screens for
ethanol, thermal tolerance or other properties of interest.

In one aspect, organisms having increased ethanol tolerance are selected for. A
population of natural S. cerevisiae isolates is mutagenized. This population is then grown
under fermentor conditions under low initial ethanol concentrations. Once the culture has
reached saturation, the culture is diluted into fresh medium having a slightly higher ethanol
content. This process of successive dilution into medium of incrementally increasing ethanol
concentration is continued until a threshold of ethanol tolerance is reached. The surviving
mutants having the highest ethanol tolerance are then pooled and their genomes
recombined by any method noted herein. Enrichment could also be achieved by a continuous
culture in a chemostat or turbidostat in which temperature or ethanol concentrations are
progressively elevated. The resulting shuffled population is then exposed once again to the
enrichment strategy but at a higher starting medium ethanol concentration. This strategy is
optionally applied for the enrichment of thermotolerant cells and for the enrichment of cells
having combined thermo- and ethanol tolerance.
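
The successive-dilution scheme can be sketched as a simple loop that raises the ethanol concentration until no strain would survive the next step. This Python sketch is only an illustration under stated assumptions: the function name, the strain labels, and the idea that each strain has a single fixed tolerance limit are hypothetical and are not taken from the specification.

```python
def stepwise_enrichment(tolerance, start, step):
    """Raise the ethanol concentration until no strain would survive the next step.

    `tolerance` maps a strain label to the highest ethanol concentration
    (% v/v) at which that strain still grows (a simplifying assumption).
    Returns the last concentration that still had survivors, together with
    the sorted labels of those survivors.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    conc = start
    survivors = sorted(s for s, limit in tolerance.items() if limit >= conc)
    if not survivors:
        raise ValueError("no strain grows at the starting concentration")
    while True:
        next_conc = conc + step
        next_survivors = [s for s in survivors if tolerance[s] >= next_conc]
        if not next_survivors:
            return conc, survivors
        conc, survivors = next_conc, next_survivors


if __name__ == "__main__":
    # Hypothetical tolerance limits for four mutagenized isolates.
    limits = {"a": 6.0, "b": 9.5, "c": 12.0, "d": 12.5}
    threshold, pooled = stepwise_enrichment(limits, start=5.0, step=1.0)
    print(threshold, pooled)
```

The survivors returned at the threshold correspond to the mutants that would be pooled and recombined before the next, more stringent, round of enrichment.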

21. Screening for Improved Strains 
Strains showing viability in initial selections are assayed more quantitatively for
improvements in the desired properties before being reshuffled with other strains.
Progeny resulting from mutagenesis of a strain, or those pre-selected for their ethanol
tolerance and/or thermostability, can be plated on non-selective agar. Colonies can be picked
robotically into microtiter dishes and grown. Cultures are replicated to fresh microtiter plates,
and the replicates are incubated under the appropriate stress condition(s). The growth or
metabolic activity of individual clones may be monitored and ranked. Indicators of viability
can range from the size of growing colonies on solid media, to the density of growing cultures,
to the color change of a metabolic activity indicator added to liquid media. Strains that show the
greatest viability are then mixed and shuffled, and the resulting progeny are rescreened under
more stringent conditions.
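
The ranking step of such a microtiter screen can be sketched as follows. This is a minimal Python illustration, assuming that growth is read as an optical density for each well and that clones are scored by the ratio of stressed to unstressed growth; the function name, the well labels, and the scoring rule are assumptions made for the example.

```python
def rank_clones(control_od, stress_od, top_n):
    """Rank clones by growth under stress relative to unstressed growth.

    `control_od` and `stress_od` map a well label to a culture density
    reading from the unstressed and stressed replicate plates. Clones that
    did not grow on the control plate are skipped. Ties are broken by label.
    """
    scores = {}
    for clone, control in control_od.items():
        if control <= 0:
            continue  # no growth even without stress, so no usable ratio
        scores[clone] = stress_od[clone] / control
    ranked = sorted(scores, key=lambda clone: (-scores[clone], clone))
    return ranked[:top_n]


if __name__ == "__main__":
    # Hypothetical density readings for four wells.
    control = {"A1": 1.0, "A2": 1.0, "A3": 0.0, "A4": 0.8}
    stressed = {"A1": 0.2, "A2": 0.9, "A3": 0.0, "A4": 0.6}
    print(rank_clones(control, stressed, top_n=2))
```

The clones returned by such a ranking would be the ones mixed and shuffled for the next round.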

22. Development of an Ethanologen Capable of Converting Cellulose to
Ethanol

Once a strain of yeast exhibiting thermotolerance and ethanol tolerance is
developed, the degradation of cellulose to monomeric sugars is provided by the inclusion in
the host strain of an efficient cellulase degradation pathway.

Additional desirable characteristics can be useful to enhance the production of
ethanol by the host. For example, inclusion of heterologous enzymes and pathways that
broaden the substrate sugar range may be performed. "Tuning" of the strain can be
accomplished by the addition of various other traits, or the restoration of certain endogenous
traits that are desirable, but lost during the recombination procedures.

23. Conferring of Cellulase Activity
A vast number of cellulases and cellulase degradation systems have been
characterized from fungi, bacteria and yeast (see reviews by Beguin, P and Aubert, J-P (1994)
FEMS Microbiol. Rev. 13: 25-58; Ohmiya, K. et al. (1997) Biotechnol. Genet. Eng. Rev. 14:
365-414). An enzymatic pathway required for efficient saccharification of cellulose involves
the synergistic action of endoglucanases (endo-1,4-β-D-glucanases, EC 3.2.1.4),
exocellobiohydrolases (exo-1,4-β-D-glucanases, EC 3.2.1.91), and β-glucosidases (cellobiases,
1,4-β-D-glucanases, EC 3.2.1.21) (Fig. 9). The heterologous production of cellulase enzymes
in the ethanologen would enable the saccharification of cellulose, producing monomeric sugars
that may be used by the organism for ethanol production. There are several advantages to the
heterologous expression of a functional cellulase pathway in the ethanologen. For example,
the SSF process would eliminate the need for a separate bioprocess step for saccharification,
and would ameliorate end-product inhibition of cellulase enzymes by accumulated intermediate
and product sugars.

Naturally occurring cellulase pathways are inserted into the ethanologen, or
one may choose to use custom improved "hybrid" cellulase pathways, employing the
coordinate action of cellulases derived from different natural sources, including thermophiles.

Several cellulases from non-Saccharomyces sources have been produced in and secreted
from Saccharomyces successfully, including bacterial, fungal, and yeast enzymes, for example T.
reesei CBH I (Shoemaker (1994), in "The Cellulase System of Trichoderma reesei:
Trichoderma strain improvement and Expression of Trichoderma cellulases in Yeast," Online,
Pinner, UK, 593-600). It is possible to employ straightforward metabolic engineering
techniques to engender cellulase activity in Saccharomyces. Also, yeast have been forced to
acquire elements of cellulose degradation pathways by protoplast fusion (e.g. intergeneric
hybrids of Saccharomyces cerevisiae and Zygosaccharomyces fermentati, a cellobiase-producing
yeast, have been created (Pina A, et al. (1986) Appl. Environ. Microbiol. 51: 995-1003)).
In general, any cellulase component enzyme that derives from a closely related yeast
organism could be transferred by protoplast fusion. Cellobiases produced by a somewhat
broader range of yeast may be accessed by whole genome shuffling in one of its many formats
(e.g. whole, fragmented, YAC-based).

Optimally, the cellulase enzymes to be used should exhibit good synergy, an
appropriate level of expression and secretion from the host, good specific activity (i.e.
resistance to host degradation factors and enzyme modification) and stability in the desired
SSF environment. An example of a hybrid cellulose degradation pathway having excellent
synergy includes the following enzymes: CBH I exocellobiohydrolase of Trichoderma reesei,
the Acidothermus cellulolyticus E1 endoglucanase, and the Thermomonospora fusca E3
exocellulase (Baker, et al. (1998) Appl. Biochem. Biotechnol. 70-72: 395-403).

It is suggested here that these enzymes (or improved mutants thereof) be
considered for use in the SSF organism, along with a cellobiase (β-glucosidase), such as that
from Candida peltata. Other possible cellulase systems to be considered should possess
particularly good activity against crystalline cellulose, such as the T. reesei cellulase system
(Teeri, TT, et al. (1998) Biochem. Soc. Trans. 26: 173-178), or possess particularly good
thermostability characteristics (e.g. cellulase systems from thermophilic organisms, such as
Thermomonospora fusca (Zhang, S., et al. (1995) Biochem. 34: 3386-3395)).

A rational approach to the cloning of cellulases in the ethanologenic yeast host 
could be used. For example, known cellulase genes are cloned into expression cassettes 
utilizing S. cerevisiae promoter sequences, and the resultant linear fragments of DNA may be 
transformed into the recipient host by placing short yeast sequences at the termini to 
encourage site-specific integration into the genome. This is preferred to plasmidic
transformation for reasons of genetic stability and maintenance of the transforming DNA. 

If an entire cellulose degradative pathway were introduced, a selection could be 
implemented in an agar-plate-based format, and a large number of clones could be assayed for 
cellulase activity in a short period of time. For example, selection for an exocellulase may be 
accessible by providing a soluble oligocellulose substrate or carboxymethylcellulose (CMC) as
a sole carbon source to the host, otherwise unable to grow on agar containing this sole carbon 
source. Clones producing active cellulase pathways would grow by virtue of their ability to 
produce glucose. 

Alternatively, if the different cellulases were to be introduced sequentially, it
would be useful to first introduce a cellobiase, enabling a selection using commercially
available cellobiose as a sole carbon source. Several strains of S. cerevisiae that are able to
grow on cellobiose have been created by introduction of a cellobiase gene (e.g. Rajoka MI, et
al. (1998) Folia Microbiol. (Praha) 43, 129-135; Skory, CD, et al. (1996) Curr. Genet. 30,
417-422; D'Auria, S, et al. (1996) Appl. Biochem. Biotechnol. 61, 157-166; Adam, AC, et
al. (1995) Yeast 11, 395-406; Adam, AC (1991) Curr. Genet. 20, 5-8).

Subsequent transformation of this organism with CBH I exocellulase can be
selected for by growth on a cellulose substrate such as carboxymethylcellulose (CMC).
Finally, addition of an endoglucanase creates a yeast strain with improved crystalline
cellulose degradation capacity.

24. Conferring of Pentose Sugar Utilization

Inclusion of pentose sugar utilization pathways is an important facet to a
potentially useful SSF organism. The successful expression of xylose sugar utilization
pathways for ethanol production has been reported in Saccharomyces (e.g. Chen, ZD and Ho,
NWY (1993) Appl. Biochem. Biotechnol. 39/40: 135-147).

It would also be useful to accomplish L-arabinose substrate utilization for
ethanol production in the Saccharomyces host. Yeast strains that utilize L-arabinose include
some Candida and Pichia spp. (McMillan JD and Boynton BL (1994) Appl. Biochem.
Biotechnol. 45/46: 569-584; Dien BS, et al. (1996) Appl. Biochem. Biotechnol. 57-58: 233-242).
Genes necessary for arabinose fermentation in E. coli could also be introduced by
rational means (e.g. as has been performed previously in Z. mobilis (Deanda K, et al. (1996)
Appl. Environ. Microbiol. 62: 4465-4470)).

25. Conferring of Other Useful Activities
Several other traits that are important for optimization of an SSF strain have
been shown to be transferable to S. cerevisiae. Like thermal tolerance, cellulase activity and
pentose sugar utilization, these traits may not normally be exhibited by Saccharomyces (or the
particular strain of Saccharomyces being used as a host), and may be added by genetic means.
For example, expression of human muscle acylphosphatase in S. cerevisiae has been suggested
to increase ethanol production (Raugei, G., et al. (1996) Biotechnol. Appl. Biochem. 23: 273-278).

It can occur that evolved stress-tolerant SSF strains acquire some undesirable
mutations in the course of the evolution strategy. Indeed, this is a pervasive problem in strain
improvement strategies that rely on mutagenesis techniques, and can result in highly unstable
or fragile production strains. It is possible to restore some of the lost desirable traits by rational
methods such as cloning of specific genes that have been knocked out or negatively influenced
in the previous rounds of strain improvement. The advantage of this approach is specificity:
the offending gene may be targeted directly. The disadvantage is that it may be time-consuming
and repetitious if several genes have been compromised, and it only addresses
problems that have been characterized. A preferred (and more traditional) approach to the
removal of undesirable/deleterious mutations is to back-cross the evolved strain to a desirable
parent strain (e.g. the original "host" SSF strain). This strategy has been employed
successfully throughout strain improvement where accessible (i.e. for organisms that have
sexual cycles of reproduction). When lacking the advantage of a sexual process, it has been
accomplished by using other methods, such as parasexual recombination or protoplast fusion.
For example, the ability to flocculate was conferred on a non-flocculating strain of S.
cerevisiae by protoplast fusion with a flocculation-competent S. cerevisiae (Watari, J., et al.
(1990) Agric. Biol. Chem. 54: 1677-1681).

N. IN VITRO WHOLE GENOME SHUFFLING

The shuffling of large DNA sequences, such as eukaryotic chromosomes, is
difficult by prior art in vitro shuffling methods. A method for overcoming this limitation is
described herein.

The cells of related eukaryotic species are gently lysed and the intact
chromosomes are liberated. The liberated chromosomes are then sorted by FACS or a similar
method (such as pulsed field electrophoresis) with chromosomes of similar size being
sequestered together. Each size fraction of the sorted chromosomes generally will represent a
pool of analogous chromosomes, for example the Y chromosome of related mammals. The
goal is to isolate intact chromosomes that have not been irreversibly damaged.

The fragmentation and reassembly of such large complex pieces of DNA
employing DNA polymerases is difficult and would likely introduce an unacceptably high level
of random mutations. An alternative approach that employs restriction enzymes and DNA
ligase provides a feasible, less destructive solution. A chromosomal fraction is digested with
one or more restriction enzymes that recognize long DNA sequences (~15-20 bp), such as the
intron and intein encoded endonucleases (I-Ppo I, I-Ceu I, PI-Psp I, PI-Tli I, PI-Sce I (VDE)).
These enzymes each cut, at most, a few times within each chromosome, resulting in a
combinatorial mixture of large fragments, each having overhanging single stranded termini that
are complementary to other sites cleaved by the same enzyme.

The digest is further modified by very short incubation with a single stranded
exonuclease. The polarity of the nuclease chosen is dependent on the single stranded
overhang resulting from the restriction enzyme chosen: a 5'-3' exonuclease for 3' overhangs,
and a 3'-5' exonuclease for 5' overhangs. This digestion results in significantly long regions of
ssDNA overhang on each dsDNA terminus. The purpose of this incubation is to generate
ssDNA regions that define the specific regions of DNA where recombination can occur. The
fragments are then incubated under conditions where the ends of the fragments anneal with
other fragments having homologous ssDNA termini. Often, the two fragments annealing will
have originated from different chromosomes and in the presence of DNA ligase are covalently
linked to form a chimeric chromosome. This generates genetic diversity mimicking the
crossing over of homologous chromosomes. The complete ligation reaction will contain a
combinatorial mixture of all possible ligations of fragments having homologous overhanging
termini. A subset of this population will be complete chimeric chromosomes.
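
The size of the "complete chimeric chromosome" subset can be illustrated by enumeration. The Python sketch below is a simplified model and not the disclosed procedure: it assumes that every homologous chromosome is cut at the same conserved sites, that a fragment only ligates to a fragment occupying the next map position, and that segment labels such as "a1" are hypothetical.

```python
from itertools import product


def chimeric_chromosomes(parents):
    """Enumerate full-length chimeras assembled from homologous chromosomes.

    `parents` is a list of chromosomes, each given as a list of segment
    labels in map order. All chromosomes are assumed to be cut at the same
    sites, so segment i of any parent can ligate to segment i + 1 of any
    other parent. With k parents and n segments this yields k ** n products,
    k of which are the unchanged parental chromosomes.
    """
    n_segments = len(parents[0])
    if any(len(p) != n_segments for p in parents):
        raise ValueError("all parents must have the same number of segments")
    choices = [[p[i] for p in parents] for i in range(n_segments)]
    return [tuple(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Two homologs, each cut twice into three segments (hypothetical labels).
    homologs = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]
    for chromosome in chimeric_chromosomes(homologs):
        print("-".join(chromosome))
```

Because the enzymes cut only a few times per chromosome, the number of segments stays small, and the diversity comes mainly from the number of related chromosomes pooled in each size fraction.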

To screen the shuffled library, the chromosomes are delivered to a suitable host
in a manner allowing for the uptake and expression of entire chromosomes. For example,
YACs (yeast artificial chromosomes) can be delivered to eukaryotic cells by protoplast fusion.
Thus, the shuffled library could be encapsulated in liposomes and fused with protoplasts of the
appropriate host cell. The resulting transformants would be propagated and screened for the
desired cellular improvements. Once an improved population was identified, the
chromosomes would be isolated, shuffled, and screened recursively.

O. WHOLE GENOME SHUFFLING OF NATURALLY COMPETENT
MICROORGANISMS

Natural competence is a phenomenon observed for some microbial species
whereby individual cells take up DNA from the environment and incorporate it into their
genome by homologous recombination. Bacillus subtilis and Acinetobacter spp. are known
to be particularly efficient at this process. A method for the whole genome shuffling (WGS)
of these and analogous organisms employing this process is described.

One goal of whole genome shuffling is the rapid accumulation of useful
mutations from a population of individual strains into one superior strain. If the organisms to
be evolved are naturally competent, then a split pooled strategy for the recursive
transformation of naturally competent cells with DNA originating from the pool will effect this
process. An example procedure is as follows.

A population of naturally competent organisms that demonstrates a variety of
useful traits (such as increased protein secretion) is identified. The strains are pooled, and the
pool is split. One half of the pool is used as a source of gDNA, while the other is used to
generate a pool of naturally competent cells.

The competent cells are grown in the presence of the pooled gDNA to allow
DNA uptake and recombination. Cells of one genotype take up and incorporate gDNA from
cells of a different type, generating cells having chimeric genomes. The result is a population
of cells representing a combinatorial mixture of the genetic variations originating in the
original pool. These cells are pooled again and transformed with the same source of DNA
again. This process is carried out recursively to increase the diversity of the genomes of cells
resulting from transformation. Once sufficient diversity has been generated, the cell
population is screened for new chimeric organisms demonstrating desired improvements.

This process is enhanced by increasing the natural competence of the host
organism. COMS is a protein that, when expressed in B. subtilis, enhances the efficiency of
natural competence mediated transformation by more than an order of magnitude.

It was demonstrated that approximately 100% of the cells harboring the
plasmid pCOMS take up and recombine genomic DNA fragments into their genomes. In
general, approximately 10% of the genome is recombined into any given transformed cell. This
observation was demonstrated by the following.

A strain of B. subtilis pCOMS auxotrophic for two nutritional markers was
transformed with genomic DNA (gDNA) isolated from a prototrophic strain of the same
organism. 10% of the cells exposed to the DNA were prototrophic for one of the two nutrient
markers. The average size of the DNA strand taken up by B. subtilis is approximately 50 kb or
~2% of the genome. Thus 1 of every ten cells had recombined a marker that was represented
in 1 of every fifty molecules of gDNA taken up. Thus, most of the cells take up and recombine
with approximately five 50 kb molecules or 10% of the genome. This method represents a
powerful tool for rapidly and efficiently recombining whole microbial genomes.
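
The arithmetic behind the "five molecules, 10% of the genome" estimate can be written out explicitly. The Python sketch below uses only the figures given in the text (10% marker frequency, 50 kb fragments, each about 2% of the genome); the function name and the linear approximation, in which the chance of acquiring a given marker is taken as the number of fragments times the chance that one fragment carries it, are assumptions made for illustration.

```python
def uptake_estimate(marker_frequency, fragment_kb, fragment_fraction):
    """Estimate fragments recombined per cell and the genome share they cover.

    marker_frequency:  fraction of cells that recombined one given marker
    fragment_kb:       average size of one DNA strand taken up, in kb
    fragment_fraction: share of the genome represented by one such strand

    Assumes each fragment independently carries the marker with probability
    `fragment_fraction` and that the expected marker frequency is simply
    fragments_per_cell * fragment_fraction (a linear approximation).
    """
    fragments_per_cell = marker_frequency / fragment_fraction
    kb_per_cell = fragments_per_cell * fragment_kb
    genome_share = fragments_per_cell * fragment_fraction
    return fragments_per_cell, kb_per_cell, genome_share


if __name__ == "__main__":
    fragments, kb, share = uptake_estimate(0.10, 50, 0.02)
    print(f"{fragments:.1f} fragments, {kb:.0f} kb, {share:.0%} of the genome")
```

With the stated figures this reproduces the estimate in the text: about five 50 kb molecules per cell, or roughly 10% of the genome.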

In the absence of pCOMS, only 0.3% of the cells prepared for natural
competency take up and integrate a specific marker. This suggests that about 15% of the
cells actually underwent recombination with a single genomic fragment. Thus, a recursive
transformation strategy as described above produces a whole genome shuffled library, even in
the absence of pCOMS. In the absence of pCOMS, however, the complex genomes will
represent a smaller, but still screenable percentage of the transformed or shuffled population.

P. CONGRESSION 

Congression is the integration of two independent unlinked markers into a cell.
0.3% of naturally competent B. subtilis cells integrate a single marker (described above). Of
these, about 10% have taken up an additional marker. Thus, if one selects or screens for the
integration of one specific marker, 10% of the resulting population will have integrated
another specific marker. This provides a way of enriching for specific integration events.
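
The enrichment implied by these two frequencies can be computed directly. The following Python sketch uses the 0.3% and 10% figures stated above; the function names and the "colonies to screen" helper are illustrative assumptions and not part of the described method.

```python
import math


def congression_enrichment(single_marker_freq, co_integration_freq):
    """Fold enrichment gained by first selecting for an unlinked marker."""
    if single_marker_freq <= 0:
        raise ValueError("single_marker_freq must be positive")
    return co_integration_freq / single_marker_freq


def colonies_to_screen(target_hits, hit_frequency):
    """Expected number of colonies to screen to find `target_hits` integrants."""
    if hit_frequency <= 0:
        raise ValueError("hit_frequency must be positive")
    return math.ceil(target_hits / hit_frequency)


if __name__ == "__main__":
    fold = congression_enrichment(0.003, 0.10)
    print(f"enrichment: about {fold:.0f}-fold")
    print("colonies per hit without selection:", colonies_to_screen(1, 0.003))
```

In practical terms, a screen that would need several hundred colonies per desired integrant without pre-selection needs on the order of ten once a co-transformed marker has been selected.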

For example, if one is looking for the integration of a gene for which there is no
easy screen or selection, it will exist as 0.3% of the cell population. If the population is first
selected for a specific integration event, then the desired integration will be found in 10% of
the population. This represents a significant (~30-fold) enrichment for the desired event. This
enrichment is defined as the "congression effect." The congression effect is not influenced by
the presence of pCOMS; thus the "pCOMS effect" is simply to increase the percentage of
naturally competent cells that are truly naturally competent from about 15% in its absence to
100% in its presence. All competent cells still take up about the same amount of DNA, or
~10% of the Bacillus genome.

The congression effect can be used in the following examples to enhance whole
genome shuffling as well as the targeted integration of shuffled genes to the chromosome.

Q. A ^g/iBm/y SHUFFLING

A population of B. subtilis cells having desired properties is identified, pooled
and shuffled as described above with one exception: once the pooled population is split, half of
the population is transformed with an antibiotic selection marker that is flanked by sequence
that targets its integration into, and disruption of, a specific nutritional gene, for example, one
involved in amino acid biosynthesis. Transformants resistant to the drug are auxotrophic for that
nutrient. The resistant population is pooled and grown under conditions rendering them
naturally competent (or optionally first transformed with pCOMS).

The competent cells are then transformed with gDNA isolated from the original
pool, and prototrophs are selected. The prototrophic population will have undergone
recombination with genomic fragments encoding a functional copy of the nutritional marker,
and thus will be enriched for cells having undergone recombination at other genetic loci by the
congression effect.

R. TARGETING OF GENES AND GENE LIBRARIES TO THE
CHROMOSOME

It is useful to be able to efficiently deliver genes or gene libraries directly to a 
specific location in a cell's chromosome. As above, target cells are transformed with a positive 
selection marker flanked by sequences that target its homologous recombination into the 
chromosome. Selected cells harboring the marker are made naturally competent (with or 
without pCOMS, but preferably the former) and transformed with a mixture of two sets of 
DNA fragments. The first set contains a gene or a shuffled library of genes, each flanked with 
sequence to target its integration to a specific chromosomal locus. The second set contains a 
positive selection marker (different from that first integrated into the cells) flanked by 
sequence that will target its integration and replacement of the first positive selection marker. 
Under optimal conditions, the mixture is such that the gene or gene library is in molar excess 
over the positive selection marker. Transformants are then selected for cells containing the 
new positive marker. These cells are enriched for cells having integrated a copy of the desired 
gene or gene library by the congression effect and can be directly screened for cells harboring 
the gene or gene variants of interest. This process was carried out using PCR fragments 
<10 kb, and it was found that, employing the congression effect, a population can be enriched 
such that 50% of the cells are congressants. Thus, one in two cells contained a gene or gene 
variant. 
Alternatively, the expression host can lack the first positive selection 
marker, and the competent cells are transformed with a mixture of the target genes and a 
limiting amount of the first positive selection marker fragment. Cells selected for the positive 
marker are screened for the desired properties in the targeted genes. The improved genes are 
amplified by the PCR, shuffled again, and then returned to the original host again with the first 
positive selection marker. This process is carried out recursively until the desired function of 
the genes is obtained. This process obviates the need to construct a primary host strain and 
the need for two positive markers. 
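
By way of illustration only, the enrichment obtained through the congression effect can be 
sketched with a toy model. The model is an assumption made for this sketch and is not taken 
from the experiments described above: only a fraction of the population is competent, and a 
competent cell that integrates the selection marker integrates an unlinked gene fragment with 
a fixed probability. The function name and the 1% competent fraction are hypothetical; the 
50% figure is the congressant frequency reported above. 

```python
def congression_enrichment(competent_fraction, p_gene_uptake):
    """Toy model of congression (illustrative assumptions only).

    competent_fraction: fraction of cells able to take up DNA (hypothetical).
    p_gene_uptake: probability that a competent cell integrates the
        unlinked gene fragment (hypothetical).
    Returns (frequency without selection, frequency after marker
    selection, fold enrichment).
    """
    if not 0 < competent_fraction <= 1:
        raise ValueError("competent_fraction must be in (0, 1]")
    if not 0 <= p_gene_uptake <= 1:
        raise ValueError("p_gene_uptake must be in [0, 1]")
    # Without selection, only competent cells can carry the gene.
    unselected = competent_fraction * p_gene_uptake
    # Marker selection removes non-competent cells, so the frequency among
    # survivors equals the per-competent-cell uptake probability.
    selected = p_gene_uptake
    # In this model the fold enrichment depends only on the competent fraction.
    enrichment = 1.0 / competent_fraction
    return unselected, selected, enrichment


if __name__ == "__main__":
    # 1% competent cells is an arbitrary value chosen for the example.
    before, after, fold = congression_enrichment(0.01, 0.5)
    print(f"without selection: {before:.3%}, after selection: {after:.0%}, "
          f"enrichment: {fold:.0f}-fold")
```

Under these assumptions, selecting for the marker removes the non-competent majority, which 
is why screening the selected cells is far less laborious than screening the whole population. 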

S. CONJUGATION-MEDIATED GENETIC EXCHANGE 
Conjugation can be employed in the evolution of cell genomes in several ways. 
Conjugative transfer of DNA occurs during contact between cells. See Guiney (1993) in: 
Bacterial Conjugation (Clewell, ed., Plenum Press, New York), pp. 75-104; Reimmann & 
Haas in Bacterial Conjugation (Clewell, ed., Plenum Press, New York 1993), at pp. 137-188 
(incorporated by reference in their entirety for all purposes). Conjugation occurs between 
many types of gram negative bacteria, and some types of gram positive bacteria. Conjugative 
transfer is also known between bacteria and plant cells (Agrobacterium tumefaciens) or yeast. 
As discussed in patent 5,837,458, the genes responsible for conjugative transfer can 
themselves be evolved to expand the range of cell types (e.g., from bacteria to mammals) 
between which such transfer can occur. 

Conjugative transfer is effected by an origin of transfer (oriT) and flanking 
genes (MOB A, B and C), and 15-25 genes, termed tra, encoding the structures and enzymes 
necessary for conjugation to occur. The transfer origin is defined as the site required in cis for 
DNA transfer. Tra genes include tra A, B, C, D, E, F, G, H, I, J, K, L, M, N, P, Q, R, S, T, 
U, V, W, X, Y, Z, vir AB (alleles 1-11), C, D, E, G, IHF, and FinOP. Tra genes can be 
expressed in cis or trans to oriT. Other cellular enzymes, including those of the RecBCD 
pathway, RecA, SSB protein, DNA gyrase, DNA pol I, and DNA ligase, are also involved in 
conjugative transfer. RecE or RecF pathways can substitute for RecBCD. 

One structural protein encoded by a tra gene is the sex pilus, a filament 
constructed of an aggregate of a single polypeptide protruding from the cell surface. The sex 
pilus binds to a polysaccharide on recipient cells and forms a conjugative bridge through which 
DNA can transfer. This process activates a site-specific nuclease encoded by a MOB gene, 
which specifically cleaves DNA to be transferred at oriT. The cleaved DNA is then threaded 
through the conjugation bridge by the action of other tra enzymes. 



Mobilizable vectors can exist in episomal form or integrated into the 
chromosome. Episomal mobilizable vectors can be used to exchange fragments inserted into 
the vectors between cells. Integrated mobilizable vectors can be used to mobilize adjacent 
genes from the chromosome. 

T. USE OF INTEGRATED MOBILIZABLE VECTORS TO PROMOTE 
EXCHANGE OF GENOMIC DNA 

The F plasmid of E. coli integrates into the chromosome at high frequency and 
mobilizes genes unidirectionally from the site of integration (Clewell, 1993, supra; Firth et al., 
in Escherichia coli and Salmonella Cellular and Molecular Biology 2, 2377-2401 (1996); 
Frost et al., Microbiol. Rev. 58, 162-210 (1994)). Other mobilizable vectors do not 
spontaneously integrate into a host chromosome at high efficiency, but can be induced to do 
so by growth under particular conditions (e.g., treatment with a mutagenic agent, growth at a 
nonpermissive temperature for plasmid replication). See Reimmann & Haas in Bacterial 
Conjugation (ed. Clewell, Plenum Press, NY 1993), Ch. 6. Of particular interest is the IncP 
group of conjugal plasmids, which are typified by their broad host range (Clewell, 1993, supra). 

Donor "male" bacteria which bear a chromosomal insertion of a conjugal 
plasmid, such as the E. coli F factor, can efficiently donate chromosomal DNA to recipient 
"female" enteric bacteria which lack F (F-). Conjugal transfer from donor to recipient is 
initiated at oriT. Transfer of the nicked single strand to the recipient occurs in a 5' to 3' 
direction by a rolling circle mechanism which allows mobilization of tandem chromosomal 
copies. Upon entering the recipient, the donor strand is discontinuously replicated. The 
linear, single-stranded donor DNA strand is a potent substrate for initiation of recA-mediated 
homologous recombination within the recipient. Recombination between the donor strand and 
recipient chromosomes can result in the inheritance of donor traits. Accordingly, strains which 
bear a chromosomal copy of F are designated Hfr (for high frequency of recombination) (Low, 
1996 in Escherichia coli and Salmonella Cellular and Molecular Biology Vol. 2, pp. 2402- 
2405; Sanderson, in Escherichia coli and Salmonella Cellular and Molecular Biology 2, 
2406-2412 (1996)). 

The ability of strains with integrated mobilizable vector to transfer 
chromosomal DNA provides a rapid and efficient means of exchanging genetic material 
between a population of bacteria, thereby allowing combination of positive mutations and 
dilution of negative mutations. Such shuffling methods typically start with a population of 
strains with an integrated mobilizable vector encompassing at least some genetic diversity. 
The genetic diversity can be the result of natural variation, exposure to a mutagenic agent or 
introduction of a fragment library. The population of cells is cultured without selection to 
allow genetic exchange, recombination and expression of recombinant genes. The cells are 
then screened or selected for evolution toward a desired property. The population surviving 
selection/screening can then be subject to a further round of shuffling by Hfr-mediated 
genetic exchange, or otherwise. 

The natural efficiency of Hfr and other strains with integrated mob vectors as 
recipients of conjugal transfer can be improved by several means. The relatively low recipient 
efficiency of natural Hfr strains is attributable to the products of the traS and traT genes of F 
(Clewell, 1993, supra; Firth et al., 1996, supra; Frost et al., 1994, supra; Achtman et al., J. 
Mol. Biol. 138, 779-795 (1980)). These products are localized to the inner and outer 
membranes of F+ strains, respectively, where they serve to inhibit redundant matings between 
two strains which are both capable of donating DNA. The effects of traS and traT, and 
cognate genes in other conjugal plasmids, can be eliminated by use of knockout cells incapable 
of expressing these enzymes or reduced by propagating cells on a carbon-limited source 
(Peters et al., J. Bacteriol. 178, 3037-3043 (1996)). 

In some methods, the starting population of cells has a mobilizable vector 
integrated at different genomic sites. Directional transfer from oriT typically results in more 
frequent inheritance of traits proximal to oriT. This is because mating pairs are fragile and 
tend to dissociate (particularly when in liquid medium), resulting in the interruption of transfer. 
In a population of cells having a mobilizable vector integrated at different sites, chromosomal 
exchange occurs in a more random fashion. Kits of Hfr strains are available from the E. coli 
Genetic Stock Center and the Salmonella Genetic Stock Centre (Frost et al., 1994, supra). 
Alternatively, a library of strains with oriT at random sites and orientations can be produced 
by insertion mutagenesis using a transposon which bears oriT. The use of a transposon bearing 
an oriT [e.g., the Tn5-oriT described by Yakobson EA, et al. J. Bacteriol. 1984 Oct; 160(1): 
451-453] provides a quick method of generating such a library. Transfer functions for 
mobilization from the transposon-borne oriT sites are provided by a helper vector in trans. It 
is possible to generate similar genetic constructs using other sequences known to one of skill 
as well. 
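
As a non-limiting sketch of why a population with oriT at many sites exchanges markers more 
evenly, the following assumes that mating pairs dissociate at a constant rate per minute of 
transfer, so that the chance of transferring a marker falls off exponentially with its distance 
from oriT. The dissociation rate, the marker positions, the oriT positions and the single 
transfer direction are hypothetical values chosen for illustration; the 100 minute circular 
map is the conventional E. coli map. 

```python
import math

MAP_LENGTH_MIN = 100.0  # conventional E. coli genetic map, in minutes


def transfer_probability(marker_min, orit_min, break_rate_per_min=0.05):
    """Chance that transfer reaches a marker before the mating pair breaks.

    Assumes (for illustration) a constant dissociation rate and transfer
    in the direction of increasing map position from oriT.
    """
    distance = (marker_min - orit_min) % MAP_LENGTH_MIN
    return math.exp(-break_rate_per_min * distance)


def inheritance_spread(markers_min, orit_sites_min):
    """Ratio of the best- to the worst-transferred marker.

    Each marker's probability is averaged over all oriT sites present in
    the donor population. A ratio near 1 means nearly uniform transfer.
    """
    averaged = []
    for marker in markers_min:
        probs = [transfer_probability(marker, site) for site in orit_sites_min]
        averaged.append(sum(probs) / len(probs))
    return max(averaged) / min(averaged)


if __name__ == "__main__":
    markers = [10.0, 50.0, 90.0]  # hypothetical marker positions
    print("single oriT:", round(inheritance_spread(markers, [0.0]), 1))
    print("four oriT sites:",
          round(inheritance_spread(markers, [0.0, 25.0, 50.0, 75.0]), 1))
```

With a single oriT the most distal marker is transferred far less often than the proximal 
one, whereas pooling donors with oriT at several sites narrows the gap considerably. 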

In one aspect, a recursive scheme for genomic shuffling using Tn-oriT elements 
is provided. A prototrophic bacterial strain or set of related strains bearing a conjugal plasmid, 
such as the F fertility factor or a member of the IncP group of broad host range plasmids, is 
mutagenized and screened for the desired properties. Individuals with the desired properties 
are mutagenized with a Tn-oriT element and screened for acquisition of an auxotrophy (e.g., 
by replica-plating to a minimal and complete media) resulting from insertion of the Tn-oriT 
element in any one of many biosynthetic genes scattered across the genome. The resulting 
auxotrophs are pooled and allowed to mate under conditions promoting male-to-male matings, 
e.g., during growth in close proximity on a filter membrane. Note that transfer functions are 
provided by the helper conjugal plasmid present in the original strain set. Recombinant 
transconjugants are selected on minimal medium and screened for further improvement. 

Optionally, strains bearing integrated mobilizable vectors are defective in 
mismatch repair gene(s). Inheritance of donor traits which arise from sequence heterologies 
increases in strains lacking the methyl-directed mismatch repair system. Optionally, the gene 
products which decrease recombination efficiency can be inhibited by small molecules. 

Intergeneric conjugal transfer between species such as E. coli and Salmonella 
typhimurium, which are 20% divergent at the DNA level, is also possible if the recipient strain 
is mutH, mutL or mutS (see Rayssiguier et al., Nature 342, 396-401 (1989)). Such transfer 
can be used to obtain recombination at several points as shown by the following example. 

One example uses an S. typhimurium Hfr donor strain having markers thr557 at 
map position 0, pyrF2690 at 33 min, serA13 at 62 min and hfrK5 at 43 min. MutS +/-, F- E. 
coli recipient strains had markers pyrD68 at 21 min, aroC355 at 51 min, ilv3164 at 85 min and 
mutS215 at 59 min. The triauxotrophic S. typhimurium Hfr donor and isogenic mutS+/- 
triauxotrophic E. coli recipient were inoculated into 3 ml of LB broth and shaken at 37°C until 
fully grown. 100 µl of the donor and each recipient were mixed in 10 ml fresh LB broth, and 
then deposited to a sterile Millipore 0.45 µm HA filter using a Nalgene 250 ml reusable 
filtration device. The donor and recipients alone were similarly diluted and deposited to check 
for reversion. The filters with cells were placed cell-side-up on the surface of an LB agar plate 
which was incubated overnight at 37°C. The filters were removed with the aid of a sterile 
forceps and placed in a sterile 50 ml tube containing 5 ml of minimal salts broth. Vigorous 
vortexing was used to wash the cells from the filters. 100 µl of mating mixtures, as well as 
donor and recipient controls, were spread to LB for viable cell counts and minimal glucose 
supplemented with either two of the three recipient requirements for single recombinant 
counts, one of the three requirements for double recombinant counts, or none of the three 
requirements for triple recombinant counts. The plates were incubated for 48 hr at 37°C, after 
which colonies were counted. 
The data indicate that recombinants can be generated at reasonable frequencies 
using Hfr matings. Intergeneric recombination is enhanced 100-200 fold in a recipient that is 
defective in methyl-directed mismatch repair. 

Frequencies are further enhanced by increasing the ratio of donor to recipient 
cells, or by repeatedly mating the original donor strains with the previously generated 
recombinant progeny. 
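
For illustration, the selective plating scheme of the example above can be enumerated as 
follows. The requirement labels are shorthand for the three recipient markers named above; 
the function names are hypothetical, and expressing recombinant frequency per viable cell 
plated is an assumption of this sketch, since the normalization used is not stated here. 
The colony counts in the usage lines are invented placeholders, not experimental data. 

```python
from itertools import combinations

# Shorthand for the three recipient auxotrophic markers in the example.
REQUIREMENTS = ("pyrD", "aroC", "ilv")


def selection_plates(requirements=REQUIREMENTS):
    """Map number of recombined loci -> supplement sets to plate on.

    Supplying all but n requirements selects cells that have become
    prototrophic at (at least) the n unsupplied loci.
    """
    plates = {}
    total = len(requirements)
    for n_recombined in range(1, total + 1):
        n_supplied = total - n_recombined
        plates[n_recombined] = [
            frozenset(combo) for combo in combinations(requirements, n_supplied)
        ]
    return plates


def recombinant_frequency(recombinant_colonies, viable_colonies):
    """Recombinants per viable cell plated (assumed normalization)."""
    if viable_colonies <= 0:
        raise ValueError("viable_colonies must be positive")
    return recombinant_colonies / viable_colonies


if __name__ == "__main__":
    for n, media in selection_plates().items():
        labels = [", ".join(sorted(m)) or "no supplement" for m in media]
        print(f"{n} locus/loci recombined: minimal glucose + {labels}")
    # Placeholder counts, for demonstration only.
    print(recombinant_frequency(50, 5_000_000))
```

Single recombinants are thus scored on three differently supplemented media, double 
recombinants on three, and triple recombinants on unsupplemented minimal glucose alone. 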

U. INTRODUCTION OF FRAGMENTS BY CONJUGATION 

Mobilizable vectors can also be used to transfer fragment libraries into cells to 
be evolved. This approach is particularly useful in situations in which the cells to be evolved 
cannot be efficiently transformed directly with the fragment library but can undergo 
conjugation with primary cells that can be transformed with the fragment library. 

DNA fragments to be introduced into host cells encompass diversity relative 
to the host cell genome. The diversity can be the result of natural diversity or mutagenesis. 
The DNA fragment library is cloned into a mobilizable vector having an origin of transfer. 
Some such vectors also contain mob genes, although alternatively these functions can also be 
provided in trans. The vector should be capable of efficient conjugal transfer between primary 
cells and the intended host cells. The vector should also confer a selectable phenotype. This 
phenotype can be the same as the phenotype being evolved or can be conferred by a marker, 
such as a drug resistance marker. The vector should preferably allow self-elimination in the 
intended host cells, thereby allowing selection for cells in which a cloned fragment has 
undergone genetic exchange with a homologous host segment rather than duplication. Such 
can be achieved by use of a vector lacking an origin of replication functional in the intended host 
type or inclusion of a negative selection marker in the vector. 

One suitable vector is the broad host range conjugation plasmid described by 
Simon et al., Bio/Technology 1, 784-791 (1983); Trieu-Cuot et al., Gene 102, 99-104 (1991); 
Bierman et al., Gene 116, 43-49 (1992). These plasmids can be transformed into E. coli and 
then force-mated into bacteria that are difficult or impossible to transform by chemical or 
electrical induction of competence. These plasmids contain the origin of the IncP plasmid, 
oriT. Mobilization functions are supplied in trans by chromosomally-integrated copies of the 
necessary genes. Conjugal transfer of DNA can in some cases be assisted by treatment of the 
recipient (if gram-positive) with sub-inhibitory concentrations of penicillins (Trieu-Cuot et al., 
1993 FEMS Microbiol. Lett. 109, 19-23). To increase diversity in populations, recursive 
conjugal mating prior to screening is performed. 

Cells that have undergone allelic exchange with library fragments can be 
screened or selected for evolution toward a desired phenotype. Subsequent rounds of 
recombination can be performed by repeating the conjugal transfer step. The library of 
fragments can be fresh or can be obtained from some (but not all) of the cells surviving a 
previous round of selection/screening. Conjugation-mediated shuffling can be combined with 
other methods of shuffling. 

V. GENETIC EXCHANGE PROMOTED BY TRANSDUCING PHAGE 
Phage transduction can include the transfer, from one cell to another, of 
nonviral genetic material within a viral coat (Masters, in Escherichia coli and Salmonella 
Cellular and Molecular Biology 2, 2421-2442 (1996)). Perhaps the two best examples of 
generalized transducing phage are bacteriophages P1 and P22 of E. coli and S. typhimurium, 
respectively. Generalized transducing bacteriophage particles are formed at a low frequency 
during lytic infection when viral-genome-sized, double-stranded fragments of host (which 
serves as donor) chromosomal DNA are packaged into phage heads. Promiscuous high 
transducing (HT) mutants of bacteriophage P22 which efficiently package DNA with little 
sequence specificity have been isolated. Infection of a susceptible host results in a lysate in 
which up to 50% of the phage are transducing particles. Adsorption of the generalized 
transducing particle to a susceptible recipient cell results in the injection of the donor 
chromosomal fragment. RecA-mediated homologous recombination following injection of the 
donor fragment can result in the inheritance of donor traits. Another type of phage which 
achieves quasi random insertion of DNA into the host chromosome is Mu. For an overview of 
Mu biology, see, Groisman (1991) in Methods in Enzymology v. 204. Mu can generate a 
variety of chromosomal rearrangements including deletions, inversions, duplications and 
transpositions. In addition, elements which combine the features of P22 and Mu are available, 
including Mud-P22, which contains the ends of the Mu genome in place of the P22 att site and 
int gene. See, Berg, supra. 

Generalized transducing phage can be used to exchange genetic material 
between a population of cells encompassing genetic diversity and susceptible to infection by 
the phage. Genetic diversity can be the result of natural variation between cells, induced 
mutation of cells or the introduction of fragment libraries into cells. DNA is then exchanged 
between cells by generalized transduction. If the phage does not cause lysis of cells, the entire 
population of cells can be propagated in the presence of phage. If the phage results in lytic 
infection, transduction is performed on a split pool basis. That is, the starting population of 
cells is divided into two. One subpopulation is used to prepare transducing phage. The 
transducing phage are then infected into the other subpopulation. Preferably, infection is 
performed at high multiplicity of phage per cell so that few cells remain uninfected. Cells 
surviving infection are propagated and screened or selected for evolution toward a desired 
property. The pool of cells surviving screening/selection can then be shuffled by a further 
round of generalized transduction or by other shuffling methods. Recursive split pool 
transduction is optionally performed prior to selection to increase the diversity of any 
population to be screened. 
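
The relationship between multiplicity of infection and the fraction of cells left uninfected 
can be estimated as sketched below. The sketch assumes that phage adsorb to cells 
independently, so that the number of phage per cell follows a Poisson distribution; this is a 
common working approximation and is an assumption of the sketch rather than a statement 
of the method above. The function names are hypothetical. 

```python
import math


def fraction_uninfected(moi):
    """Poisson estimate of the fraction of cells receiving no phage.

    moi: mean number of adsorbed phage per cell (must be >= 0).
    """
    if moi < 0:
        raise ValueError("moi must be non-negative")
    return math.exp(-moi)


def moi_for_uninfected(target_fraction):
    """Multiplicity needed so that only target_fraction stays uninfected."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return -math.log(target_fraction)


if __name__ == "__main__":
    for moi in (0.1, 1, 5, 10):
        print(f"MOI {moi}: {fraction_uninfected(moi):.4%} uninfected")
    print(f"MOI for 1% uninfected: {moi_for_uninfected(0.01):.2f}")
```

Under this approximation, the uninfected fraction falls off exponentially with multiplicity, 
which is consistent with the stated preference for a high multiplicity of phage per cell. 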

The efficiency of the above methods can be increased by reducing infection of 
cells by infectious (nontransducing) phage and by reducing lysogen formation. The former can 
be achieved by inclusion of chelators of divalent cations, such as citrate and EGTA, in culture 
media. Tail defective transducing phage can be used to allow only a single round of infection. 
Divalent cations are required for phage adsorption and the inclusion of chelating agents 
therefore provides a means of preventing unwanted infection. Integration defective (int-) 
derivatives of generalized transducing phage can be used to prevent lysogen formation. In a 
further variation, host cells with defects in mismatch repair gene(s) can be used to increase 
recombination between transduced DNA and genomic DNA. 

1. Use of Locked in Prophages to Facilitate DNA Shuffling 
The use of a hybrid, mobile genetic element (locked-in prophages) as a means 
to facilitate whole genome shuffling of organisms using phage transduction as a means to 
transfer DNA from donor to recipient is a preferred embodiment. One such element 
(Mud-P22) based on the temperate Salmonella phage P22 has been described for use in 
genetic and physical mapping of mutations. See, Youderian et al. (1988) Genetics 118:581- 
592, and Benson and Goldman (1992) J. Bacteriol. 174(5):1673-1681. Individual Mud-P22 
insertions package specific regions of the Salmonella chromosome into phage P22 particles. 
Libraries of random Mud-P22 insertions can be readily isolated and induced to create pools of 
phage particles packaging random chromosomal DNA fragments. These phage particles can 
be used to infect new cells and transfer the DNA from the host into the recipient in the process 
of transduction. Alternatively, the packaged chromosomal DNA can be isolated and 
manipulated further by techniques such as DNA shuffling or any other mutagenesis technique 
prior to being reintroduced into cells (especially recD cells for linear DNA) by transformation 
or electroporation, where they integrate into the chromosome. 

Either the intact transducing phage particles or isolated DNA can be subjected 
to a variety of mutagens prior to reintroduction into cells to enhance the mutation rate. 
Mutator cell lines such as mutD can also be used for phage growth. Either method can be 
used recursively in a process to create genes or strains with desired properties. E. coli cells 
carrying a cosmid clone of Salmonella LPS genes are infectable by P22 phage. It is possible to 
develop similar genetic elements using other combinations of transposable elements and 
bacteriophages or viruses as well. 

P22 is a lambdoid phage that packages its DNA into preassembled phage 
particles (heads) by a "headful" mechanism. Packaging of phage DNA is initiated at a specific 
site (pac) and proceeds unidirectionally along a linear, double stranded, normally concatameric 
molecule. When the phage head is full (~43 kb), the DNA strand is cleaved, and packaging of 
the next phage head is initiated. Locked-in or excision-defective P22 prophages, however, 
initiate packaging at their pac site, and then proceed unidirectionally along the chromosome, 
packaging successive headfuls of chromosomal DNA (rather than phage DNA). When these 
transducing phages infect new Salmonella cells they inject the chromosomal DNA from the 
original host into the recipient cell, where it can recombine into the chromosome by 
homologous recombination, creating a chimeric chromosome. Upon infection of recipient cells 
at a high multiplicity of infection, recombination can also occur between incoming transducing 
fragments prior to recombination into the chromosome. 

Integration of such locked-in P22 prophages at various sites in the 
chromosome allows flanking regions to be amplified and packaged into phage particles. The 
Mud-P22 mobile genetic element contains an excision-defective P22 prophage flanked by the 
ends of phage/transposon Mu. The entire Mud-P22 element can transpose to virtually any 
location in the chromosome or other episome (e.g., F', BAC clone) when the Mu A and B 
proteins are provided in trans. 
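
As an illustrative sketch of headful packaging from a locked-in prophage, the following 
lists the chromosomal intervals carried by successive phage heads, starting at the pac site 
and proceeding in one direction around a circular chromosome. The function name, the 
coordinate convention (packaging toward increasing coordinates) and the 4,860,000 bp 
chromosome length are assumptions made for the example; the headful size follows the 
approximate figure given above. 

```python
def headful_segments(pac_site_bp, n_headfuls,
                     headful_bp=43_000, chromosome_bp=4_860_000):
    """Return (start, end) coordinates of successive headfuls.

    Packaging starts at pac_site_bp and proceeds unidirectionally; each
    head is filled with headful_bp of DNA and the next head starts where
    the previous one ended. Coordinates wrap around the circular
    chromosome, so an interval may have end < start.
    """
    if n_headfuls < 0 or headful_bp <= 0 or chromosome_bp <= 0:
        raise ValueError("sizes must be positive and n_headfuls non-negative")
    segments = []
    start = pac_site_bp % chromosome_bp
    for _ in range(n_headfuls):
        end = (start + headful_bp) % chromosome_bp
        segments.append((start, end))
        start = end  # the next headful begins at the previous cut
    return segments


if __name__ == "__main__":
    # Hypothetical insertion with its pac site at coordinate 1,000,000.
    for i, (start, end) in enumerate(headful_segments(1_000_000, 3), 1):
        print(f"headful {i}: {start:,} to {end:,} bp")
```

Because each insertion site defines its own series of intervals, a library of insertions at 
different sites covers different regions of the chromosome. 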

A number of embodiments for this type of genetic element are available. In one 
example, the locked in prophage are used as generalized transducing phage to transfer random 
fragments of a donor chromosome into a recipient. The Mud-P22 element acts as a 
transposon when Mu A and B transposase proteins are provided in trans and integrates copies 
of itself at random locations in the chromosome. In this way, a library of random 
chromosomal Mud-P22 insertions can be generated in a suitable host. When the Mud-P22 
prophages in this library are induced, random fragments of chromosomal DNA will be 
packaged into phage particles. When these phages infect recipient cells, the chromosomal 
DNA is injected and can recombine into the chromosome of the recipient. These recipient 
cells are screened for a desired property and cells showing improvement are then propagated. 
The process can be repeated, since the Mud-P22 genetic element is not transferred to the 
recipient in this process. Infection at a high multiplicity allows for multiple chromosomal 
fragments to be injected and recombined into the recipient chromosome. 

Locked in prophages can also be used as specialized transducing phage. 
Individual insertions near a gene of interest can be isolated from a random insertion library by 
a variety of methods. Induction of these specific prophages results in packaging of flanking 
chromosomal DNA including the gene(s) of interest into phage particles. Infection of 
recipient cells with these phages and recombination of the packaged DNA into the 
chromosome creates chimeric genes that can be screened for desired properties. Infection at a 
high multiplicity of infection can allow recombination between incoming transducing 
fragments prior to recombination into the chromosome. 

These specialized transducing phage can also be used to isolate large quantities 
of high quality DNA containing specific genes of interest without any prior knowledge of the 
DNA sequence. Cloning of specific genes is not required. Insertion of such an element nearby 
a biosynthetic operon, for example, allows for large amounts of DNA from that operon to be 
isolated for use in DNA shuffling (in vitro and/or in vivo), cloning, sequencing, or other uses 
as set forth herein. DNA isolated from similar insertions in other organisms containing 
homologous operons is optionally mixed for use in family shuffling formats as described 
herein, in which homologous genes from different organisms (or different chromosomal 
locations within a single species, or both) are recombined. Alternatively, the transduced 
population is recursively transduced with pooled transducing phage or new transducing phage 
generated from the previously transduced cells. This can be carried out recursively to optimize 
the diversity of the genes prior to shuffling. 

Phage isolated from insertions in a variety of strains or organisms containing 
homologous operons are optionally mixed and used to coinfect cells at a high MOI allowing 
for recombination between incoming transducing fragments prior to recombination into the 
chromosome. 

Locked in prophage are useful for mapping of genes, operons, and/or specific 
mutations with either desirable or undesirable phenotypes. Locked-in prophages can also 
provide a means to separate and map multiple mutations in a given host. If one is looking for 
beneficial mutations outside a gene or operon of interest, then an unmodified gene or operon 
can be transduced into a mutagenized or shuffled host, then screened for the presence of 
desired secondary mutations. Alternatively, the gene/operon of interest can be readily moved 
from a mutagenized/shuffled host into a different background to screen/select for 
modifications in the gene/operon itself. 

It is also possible to develop similar genetic elements using other combinations 
of transposable elements and bacteriophages or viruses as well. Similar systems are set up in 
other organisms, e.g., that do not allow replication of P22 or P1. Broad host range phages 
and transposable elements are especially useful. Similar genetic elements are derived from 
other temperate phages that also package by a headful mechanism. In general, these are the 
phages that are capable of generalized transduction. Viruses infecting eukaryotic cells may be 
adapted for similar purposes. Examples of generalized transducing phages that are useful are 
described in: Green et al., "Isolation and preliminary characterization of lytic and lysogenic 
phages with wide host range within the streptomycetes", J. Gen Microbiol 131(9):2459-2465 
(1985); Studdard et al., "Genome structure in Streptomyces spp.: adjacent genes on the S. 
coelicolor A3(2) linkage map have cotransducible analogs in S. venezuelae", J. Bacteriol 
169(8):3814-3816 (1987); Wang et al., "High frequency generalized transduction by miniMu 
plasmid phage", Genetics 116(2):201-206, (1987); Welker, N. E., "Transduction in Bacillus 
stearothermophilus", J. Bacteriol, 176(11):3354-3359, (1988); Darzins et al., "Mini-D3112 
bacteriophage transposable elements for genetic analysis of Pseudomonas aeruginosa", J. 
Bacteriol 171(7):3909-3916 (1989); Hugouvieux-Cotte-Pattat et al., "Expanded linkage map 
of Erwinia chrysanthemi strain 3937", Mol Microbiol 3(5):573-581, (1989); Ichige et al., 
"Establishment of gene transfer systems for and construction of the genetic map of a marine 
Vibrio strain", J. Bacteriol 171(4):1825-1834 (1989); Muramatsu et al., "Two generalized 
transducing phages in Vibrio parahaemolyticus and Vibrio alginolyticus", Microbiol Immunol 
35(12):1073-1084 (1991); Regue et al., "A generalized transducing bacteriophage for Serratia 
marcescens", Res Microbiol 42(1):23-27, (1991); Kiesel et al., "Phage Acm1-mediated 
transduction in the facultatively methanol-utilizing Acetobacter methanolicus MB 58/4", J 
Gen Virol 74(9):1741-1745 (1993); Blahova et al., "Transduction of imipenem resistance by 
the phage F-116 from a nosocomial strain of Pseudomonas aeruginosa isolated in Slovakia", 
Acta Virol 38(5):247-250 (1994); Kidambi et al., "Evidence for phage-mediated gene transfer 
among Pseudomonas aeruginosa strains on the phylloplane", Appl Environ Microbiol 
60(2):496-500 (1994); Weiss et al., "Isolation and characterization of a generalized 
transducing phage for Xanthomonas campestris pv. campestris", J. Bacteriol 176(11):3354- 
3359 (1994); Matsumoto et al., "Clustering of the trp genes in Burkholderia (formerly 
Pseudomonas) cepacia", FEMS Microbiol Lett 134(2-3):265-271 (1995); Schicklmaier et al., 
"Frequency of generalized transducing phages in natural isolates of the Salmonella 
typhimurium complex", Appl Environ Microbiol 61(4):1637-1640 (1995); Humphrey 
et al., "Purification and characterization of VSH-1, a generalized transducing bacteriophage of 
Serpulina hyodysenteriae", J Bacteriol 179(2):323-329 (1997); Willi et al., "Transduction of 
antibiotic resistance markers among Actinobacillus actinomycetemcomitans strains by 
temperate bacteriophages Aa phi 23", Cell Mol Life Sci 53(11-12):904-910 (1997); Jensen et 
al., "Prevalence of broad-host-range lytic bacteriophages of Sphaerotilus natans, Escherichia 
coli, and Pseudomonas aeruginosa", Appl Environ Microbiol 64(2):575-580 (1998), and 
Nedelmann et al., "Generalized transduction for genetic linkage analysis and transfer of 
transposon insertions in different Staphylococcus epidermidis strains", Zentralbl Bakteriol 
287(1-2):85-92 (1998). 

A Mud-P1/Tn-P1 system comparable to Mud-P22 is developed using phage 
P1. Phage P1 has an advantage of packaging much larger (~110 kb) fragments per headful. 
Phage P1 is currently used to create bacterial artificial chromosomes or BACs. P1-based 
BAC vectors are designed along these principles so that cloned DNA is packaged into phage 
particles, rather than the current system, which requires DNA preparation from single-copy 
episomes. This combines the advantages of both systems in having the genes cloned in a 
stable single-copy format, whilst allowing for amplification and specific packaging of cloned 
DNA upon induction of the prophage. 

W. RANDOM PLACEMENT OF GENES OR IMPROVED GENES 
THROUGHOUT THE GENOME FOR OPTIMIZATION OF GENE 

The placement and orientation of genes in a host chromosome (the "context" of 
the gene in a chromosome) or episome has large effects on gene expression and activity. 
Random integration of plasmid or other episomal sequences into a host chromosome by 
non-homologous recombination, followed by selection or screening for the desired phenotype, 
is a preferred way of identifying optimal chromosomal positions for expression of a target. This 
strategy is illustrated in Fig. 18. 

A variety of transposon mediated delivery systems can be employed to deliver 
genes of interest, either individual genes, genomic libraries, or a library of shuffled gene(s), 
randomly throughout the genome of a host. Thus, in one preferred embodiment, the 
improvement of a cellular function is achieved by cloning a gene of interest, for example a 
gene encoding a desired metabolic pathway, within a transposon delivery vehicle. 

Such transposon vehicles are available for both Gram-negative and 
Gram-positive bacteria. De Lorenzo and Timmis (1994) Methods in Enzymology 235:385-404 
describe the analysis and construction of stable phenotypes in gram-negative bacteria with 
Tn5- and Tn10-derived minitransposons. Kleckner et al. (1991) Methods in Enzymology 
204, chapter 7 describe uses of transposons such as Tn10, including for use in gram positive 
bacteria. Petit et al. (1990) Journal of Bacteriology 172(12):6736-6740 describe Tn10 
derived transposons active in Bacillus subtilis. The transposon delivery vehicle is introduced 
into a cell population, which is then selected for recombinant cells that have incorporated the 
transposon into the genome. 

The selection is typically by any of a variety of drug resistance markers also
carried within the transposon. The selected subpopulation is screened for cells having
improved expression of the gene(s) of interest. Once cells harboring the genes of interest in
the optimal location are isolated, the genes are amplified from within the genome using PCR,
shuffled, and cloned back into a similar transposon delivery vehicle which contains a different
selection marker within the transposon and lacks the transposon integrase gene.

This shuffled library is then transformed back into the strain harboring the
original transposon, and the cells are selected for the presence of the new resistance marker
and the loss of the previous selection marker. Selected cells are enriched for those that have
exchanged by homologous recombination the original transposon for the new transposon
carrying members of the shuffled library. The surviving cells are then screened for further
improvements in the expression of the desired phenotype. The genes from the improved cells 
are then amplified by the PCR and shuffled again. This process is carried out recursively, 
oscillating each cycle between the different selection markers. Once the gene(s) of interest are 
optimized to a desired level, the fragment can be amplified and again randomly distributed 
throughout the genome as described above to identify the optimal location of the improved 
genes. 
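
The alternation of selection markers across cycles can be sketched as a simple schedule. This is only an illustrative bookkeeping aid, not part of the described method: the function name `marker_schedule` and the marker labels "kan" and "tet" are hypothetical, and the first marker in the list is assumed to be the one carried by the original transposon.

```python
from itertools import cycle


def marker_schedule(markers, n_cycles):
    """Return, per cycle, the pair (marker to select for, marker expected to be lost).

    Assumption: markers[0] is the marker on the original transposon, and each
    new delivery vehicle carries the next marker in the rotation.
    """
    if len(markers) < 2:
        raise ValueError("need at least two distinct selection markers")
    schedule = []
    previous = markers[0]
    # Rotate so that the first new vehicle carries markers[1].
    rotation = cycle(markers[1:] + markers[:1])
    for _ in range(n_cycles):
        new = next(rotation)
        schedule.append((new, previous))
        previous = new
    return schedule


if __name__ == "__main__":
    for number, (gain, loss) in enumerate(marker_schedule(["kan", "tet"], 4), start=1):
        print(f"cycle {number}: select for {gain}, screen for loss of {loss}")
```

With two markers the schedule simply oscillates; with more than two it rotates through the list, which matches the variant in which each new cycle employs an additional marker only for as many cycles as there are markers.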

Alternatively, the gene(s) conferring a desired property may not be known. In
this case the DNA fragments cloned within the transposon delivery vehicle could be a library
of genomic fragments originating from a population of cells derived from one or more strains
having the desired property(ies). The library is delivered to a population of cells derived from
one or more strains having or lacking the desired property(ies) and cells incorporating the
transposon are selected. The surviving cells are then screened for acquisition or improvement
of the desired property. The fragments contained within the surviving cells are amplified by
PCR and then cloned as a pool into a similar transposon delivery vector harboring a different
selection marker from the first delivery vector. This library is then delivered to the pool of
surviving cells, and the population having acquired the new selective marker is selected. The
selected cells are then screened for further acquisition or improvement of the desired property.
In this way the different possible combinations of genes conferring or improving a desired
phenotype are explored in a combinatorial fashion. This process is carried out repetitively
with each new cycle employing an additional selection marker. Alternatively, PCR fragments
are cloned into a pool of transposon vectors having different selective markers. These are
delivered to cells and selected for 1, 2, 3, or more markers.

Alternatively, the amplified fragments from each improved cell are shuffled
independently. The shuffled libraries are then cloned back into a transposon delivery vehicle
similar to the original vector but containing a different selection marker and lacking the
transposase gene. Selection is then for acquisition of the new marker and loss of the previous
marker. Selected cells are enriched for those incorporating the shuffled variants of the
amplified genes by homologous recombination. This process is carried out recursively,
oscillating each cycle between the two selective markers.

X. IMPROVEMENT OF OVEREXPRESSED GENES FOR A DESIRED
PHENOTYPE

The improvement of a cellular property or phenotype is often enhanced by
increasing the copy number or expression of gene(s) participating in the expression of that
property. Genes that have such an effect on a desired property can also be improved by DNA
shuffling to have a similar effect. A genomic DNA library is cloned into an overexpression
vector and transformed into a target cell population such that the genomic fragments are
highly expressed in cells selected for the presence of the overexpression vector. The selected
cells are then screened for improvement of a desired property. The overexpression vectors
from the improved cells are isolated and the cloned genomic fragments shuffled. The genomic
fragment carried in the vector from each improved isolate is shuffled independently or with
identified homologous genes (family shuffling). The shuffled libraries are then delivered back
to a population of cells and the selected transformants rescreened for further improvements in
the desired property. This shuffling/screening process is cycled recursively until the desired
property has been optimized to the desired level.

As stated above, gene dosage can greatly enhance a desired cellular property. 
One method of increasing gene copy number of unknown genes is using a method of random
amplification (see also, Mavingui et al. (1997) Nature Biotech. 15, 564). In this method, a
genomic library is cloned into a suicide vector containing a selective marker that also at higher
dosage provides an enhanced phenotype. An example of such a marker is the kanamycin
resistance gene. At successively higher copy number, resistance to successively higher levels
of kanamycin is achieved. The genomic library is delivered to a target cell by any of a variety
of methods including transformation, transduction, conjugation, etc. Cells that have 
incorporated the vector into the chromosome by homologous recombination between the 
vector and chromosomal copies of the cloned genes can be selected by requiring expression of
the selection marker under conditions where the vector does not replicate. This recombination 
event results in the duplication of the cloned DNA fragment in the host chromosome with a 
copy of the vector and selection marker separating the two copies. The population of
surviving cells is screened for improvement of a desired cellular property resulting from the
gene duplication event. Further gene duplication events resulting in additional copies of the
original cloned DNA fragments can be generated by further propagating the cells under
successively more stringent selective conditions, i.e., increased concentrations of kanamycin. In
this case selection requires increased copies of the selective marker, but increased copies of
the desired gene fragment are also concomitant. Surviving cells are further screened for an
improvement in the desired phenotype. The resulting population of cells likely resulted from the
amplification of different genes, since often many genes affect a given phenotype. To generate
a library of the possible combinations of these genes, the members of the original selected
library showing phenotypic improvements are recombined, using the methods described herein,
e.g., protoplast fusion, split pool transduction, transformation, conjugation, etc.

The recombined cells are selected for increased expression of the selective 
marker. Survivors are enriched for cells having incorporated additional copies of the vector
sequence by homologous recombination, and these cells will be enriched for those having
combined duplications of different genes. In other words, the duplication from one cell of
enhanced phenotype becomes combined with the duplication of another cell of enhanced
phenotype. These survivors are screened for further improvements in the desired phenotype.
This procedure is repeated recursively until the desired level of phenotypic expression is
achieved.

Alternatively, genes that have been identified or are suspected as being
beneficial in increased copy number are cloned in tandem into appropriate plasmid vectors.
These vectors are then transformed and propagated in an appropriate host organism.
Plasmid-plasmid recombination between the cloned gene fragments results in further
duplication of the genes. Resolution of the plasmid doublet can result in the uneven
distribution of the gene copies, with some plasmids having additional gene copies and others
having fewer gene copies. Cells carrying this distribution of plasmids are then screened for an
improvement in the phenotype affected by the gene duplications.

In summary, a method of selecting for increased copy number of a nucleic acid
sequence by the above procedure is provided. In the method, a genomic library in a suicide
vector comprising a dose-sensitive selectable marker is provided, as noted above. The
genomic library is transduced into a population of target cells. The target cells are selected in
a population of target cells for increasing doses of the selectable marker under conditions in
which the suicide vector does not replicate episomally. A plurality of target cells are selected
for the desired phenotype, recombined and reselected. The process is recursively repeated, if
desired, until the desired phenotype is obtained.

Y. STRATEGIES FOR IMPROVING GENOMIC SHUFFLING VIA
TRANSFORMATION OF LINEAR DNA FRAGMENTS

Wild-type members of the Enterobacteriaceae (e.g., Escherichia coli) are
typically resistant to genetic exchange following transformation of linear DNA molecules.
This is due, at least in part, to the Exonuclease V (Exo V) activity of the RecBCD holoenzyme,
which rapidly degrades linear DNA molecules following transformation. Production of Exo V
has been traced to the recD gene, which encodes the D subunit of the holoenzyme. As
demonstrated by Russel et al. (1989) Journal of Bacteriology 2609-2613, homologous
recombination between a transformed linear donor DNA molecule and the chromosome of the
recipient is readily detected in strains bearing a loss of function mutation in recD.
The use of recD strains provides a simple means for genomic shuffling of the
Enterobacteriaceae. For example, a bacterial strain or set of related strains bearing a recD
null mutation (e.g., the E. coli recD1903::mini-Tet allele) is mutagenized and screened for the
desired properties. In a split-pool fashion, chromosomal DNA prepared from one aliquot could
be used to transform (e.g., via electroporation or chemically induced competence) the second
aliquot. The resulting transformants are then screened for improvement, or recursively
transformed prior to screening.

The use of RecE/recT, as described supra, can improve homologous
recombination of linear DNA fragments.

The RecBCD holoenzyme plays an important role in initiation of
RecA-dependent homologous recombination. Upon recognizing a dsDNA end, the RecBCD
enzyme unwinds and degrades the DNA asymmetrically in a 5' to 3' direction until it
encounters a chi (or "χ") site (consensus 5'-GCTGGTGG-3') which attenuates the nuclease
activity. This results in the generation of a ssDNA terminating near the χ site with a 3'-ssDNA
tail that is preferred for RecA loading and subsequent invasion of dsDNA for homologous
recombination. Accordingly, preprocessing of transforming fragments with a 5' to 3' specific
ssDNA exonuclease, such as Lambda (λ) exonuclease (available, e.g., from Boehringer
Mannheim) prior to transformation may serve to stimulate homologous recombination in a recD-
strain by providing a ssDNA invasive end for RecA loading and subsequent strand invasion.

The addition of DNA sequence encoding chi sites (consensus
5'-GCTGGTGG-3') to DNA fragments can serve to both attenuate Exonuclease V activity
and stimulate homologous recombination, thereby obviating the need for a recD mutation (see
also, Kowalczykowski, et al. (1994) "Biochemistry of homologous recombination in
Escherichia coli," Microbiol. Rev. 58:401-465 and Jessen, et al. (1998) "Modification of
bacterial artificial chromosomes through Chi-stimulated homologous recombination and its
application in zebrafish transgenesis." Proc. Natl. Acad. Sci. 95:5121-5126).

Chi sites are optionally included in linkers ligated to the ends of transforming
fragments or incorporated into the external primers used to generate DNA fragments to be
transformed. The use of recombination-stimulatory sequences such as chi is a generally useful
approach for evolution of a broad range of cell types by fragment transformation.
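
Incorporating chi into the external primers can be illustrated at the sequence level. The following is a minimal sketch under stated assumptions: the helper names (`reverse_complement`, `add_chi`, `amplicon_with_chi`) and the example sequences are hypothetical, only exact primer matches are handled, and the orientation of chi relative to each fragment end, which affects recognition by RecBCD, is not modelled.

```python
CHI = "GCTGGTGG"  # chi consensus, written 5'->3', as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def add_chi(primer):
    """Prepend the chi consensus to the 5' end of a primer."""
    return CHI + primer.upper()


def amplicon_with_chi(template, fwd, rev):
    """Top strand of the product expected when both primers carry a 5' chi tail.

    fwd must match the top strand exactly; rev must match the bottom strand
    exactly. The tails are added here, so pass the untailed primers.
    """
    template = template.upper()
    start = template.find(fwd.upper())
    end = template.find(reverse_complement(rev))
    if start < 0 or end < 0 or end < start:
        raise ValueError("primers do not flank a region of the template")
    core = template[start:end + len(rev)]
    # The reverse primer's tail appears on the top strand as its reverse complement.
    return CHI + core + reverse_complement(CHI)


if __name__ == "__main__":
    template = "TTTACGTACCGGATCCAAGGTTCCATGCAAA"  # hypothetical sequence
    print(add_chi("ACGTACCG"))
    print(amplicon_with_chi(template, "ACGTACCG", "GCATGGA"))
```

In this sketch each strand of the product begins, at its 5' end, with the chi consensus, which is what tailing both external primers produces.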

Methods to inhibit or mutate analogs of Exo V or other nucleases (such as
Exonucleases I (endA1), III (nth), IV (nfo), VII, and VIII of E. coli) are similarly useful.
Inhibition or elimination of nucleases, or modification of ends of transforming DNA fragments
to render them resistant to exonuclease activity, has applications in evolution of a broad range
of cell types.

Z. SHUFFLING TO OPTIMIZE UNKNOWN INTERACTIONS

Many observed traits are the result of complex interactions of multiple genes or
gene products. Most such interactions are still uncharacterized. Accordingly, it is often
unclear which genes need to be optimized to achieve a desired trait, even if some of the genes
contributing to the trait are known.

This lack of characterization is not an issue during DNA shuffling, which
produces solutions that optimize whatever is selected for. An alternative approach, which has
the potential to solve not only this problem, but also anticipated future rate limiting factors, is
complementation by overexpression of unknown genomic sequences.

A library of genomic DNA is first made as described, supra. This is
transformed into the cell to be optimized and transformants are screened for increases in a
desired property. Genomic fragments which result in an improved property are evolved by
DNA shuffling to further increase their beneficial effect. This approach requires no sequence
information, nor any knowledge or assumptions about the nature of protein or pathway
interactions, or even of what steps are rate-limiting; it relies only on detection of the desired
phenotype. This sort of random cloning and subsequent evolution by DNA shuffling of
positively interacting genomic sequences is extremely powerful and generic. A variety of
sources of genomic DNA are used, from isogenic strains to more distantly related species with
potentially desirable properties. In addition, the technique is applicable to any cell for which
the molecular biology basics of transformation and cloning vectors are available, and for any
property which can be assayed (preferably in a high-throughput format). Alternatively, once
optimized, the evolved DNA can be returned to the chromosome by homologous
recombination or randomly by phage mediated site-specific recombination.

AA. HOMOLOGOUS RECOMBINATION WITHIN THE CHROMOSOME
Homologous recombination within the chromosome is used to circumvent the
limitations of plasmid based evolution and size restrictions. The strategy is similar to that
described above for shuffling genes within their chromosomal context, except that no in vitro
shuffling occurs. Instead, the parent strain is treated with mutagens such as ultraviolet light or
nitrosoguanidine, and improved mutants are selected. The improved mutants are pooled and
split. Half of the pool is used to generate random genomic fragments for cloning into a
homologous recombination vector. Additional genomic fragments are optionally derived from
related species with desirable properties. The cloned genomic fragments are homologously
recombined into the genomes of the remaining half of the mutant pool, and variants with
improved properties are selected. These are subjected to a further round of mutagenesis,
selection and recombination. Again this process is entirely generic for the improvement of any
whole cell biocatalyst for which a recombination vector and an assay can be developed. Here
again, it should be noted that recombination can be performed recursively prior to screening.

BB. METHODS FOR RECURSIVE SEQUENCE RECOMBINATION
Some formats and examples for recursive sequence recombination, sometimes
referred to as DNA shuffling or molecular breeding, have been described by the present
inventors and co-workers in copending application, attorney docket no. 16528A-014612, filed
March 25, 1996, PCT/US95/02126 filed February 17, 1995 (published as WO 95/22625);
Stemmer, Science 270, 1510 (1995); Stemmer et al., Gene, 164, 49-53 (1995); Stemmer,
Bio/Technology, 13, 549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. USA 91, 10747-10751
(1994); Stemmer, Nature 370, 389-391 (1994); Crameri et al., Nature Medicine, 2(1): 1-3,
(1996), and Crameri et al., Nature Biotechnology 14, 315-319 (1996) (each of which is
incorporated by reference in its entirety for all purposes).

As shown in Figs. 16 and 17, DNA shuffling provides the most rapid technology
for evolution of complex new functions. As shown in Fig. 16, panel (A), recombination in
DNA shuffling achieves accumulation of multiple beneficial mutations in a few cycles. In
contrast, because of the high frequency of deleterious mutations relative to beneficial ones,
iterative point mutation must build beneficial mutations one at a time, and consequently
requires many cycles to reach the same point. As shown in Fig. 16, panel B, rather than a
simple linear sequence of mutation accumulation, DNA shuffling is a parallel process where
multiple problems may be solved independently, and then combined.

1. In Vitro Formats

One format for shuffling in vitro is illustrated in Fig. 1. The initial substrates
for recombination are a pool of related sequences. The Xs in Fig. 1, panel A, show where the
sequences diverge. The sequences can be DNA or RNA and can be of various lengths
depending on the size of the gene or DNA fragment to be recombined or reassembled.
Preferably the sequences are from 50 bp to 50 kb.

The pool of related substrates is converted into overlapping fragments, e.g.,
from about 5 bp to 5 kb or more, as shown in Fig. 1, panel B. Often, the size of the fragments
is from about 10 bp to 1000 bp, and sometimes the size of the DNA fragments is from about
100 bp to 500 bp. The conversion can be effected by a number of different methods, such as
DNaseI or RNase digestion, random shearing or partial restriction enzyme digestion.
Alternatively, the conversion of substrates to fragments can be effected by incomplete PCR
amplification of substrates or PCR primed from a single primer. Alternatively, appropriate
single-stranded fragments can be generated on a nucleic acid synthesizer. The concentration
of nucleic acid fragments of a particular length and sequence is often less than 0.1% or 1% by
weight of the total nucleic acid. The number of different specific nucleic acid fragments in the
mixture is usually at least about 100, 500 or 1000.

The mixed population of nucleic acid fragments is converted to at least
partially single-stranded form. Conversion can be effected by heating to about 80 °C to 100
°C, more preferably from 90 °C to 96 °C, to form single-stranded nucleic acid fragments and
then reannealing. Conversion can also be effected by treatment with single-stranded DNA
binding protein or recA protein. Single-stranded nucleic acid fragments having regions of
sequence identity with other single-stranded nucleic acid fragments can then be reannealed by
cooling to 4 °C to 75 °C, and preferably from 40 °C to 65 °C. Renaturation can be accelerated
by the addition of polyethylene glycol (PEG), other volume-excluding reagents or salt. The
salt concentration is preferably from 0 mM to 200 mM, more preferably the salt concentration
is from 10 mM to 100 mM. The salt may be KCl or NaCl. The concentration of PEG is
preferably from 0% to 20%, more preferably from 5% to 10%. The fragments that reanneal
can be from different substrates as shown in Fig. 1, panel C. The annealed nucleic acid
fragments are incubated in the presence of a nucleic acid polymerase, such as Taq or Klenow,
or proofreading polymerases, such as pfu or pwo, and dNTPs (i.e., dATP, dCTP, dGTP and
dTTP). If regions of sequence identity are large, Taq polymerase can be used with an
annealing temperature of between 45-65 °C. If the areas of identity are small, Klenow
polymerase can be used with an annealing temperature of between 20-30 °C (Stemmer, Proc.
Natl. Acad. Sci. USA (1994), supra). The polymerase can be added to the random nucleic
acid fragments prior to annealing, simultaneously with annealing or after annealing.



WO 00/04190 PCT/US99/15972

The process of denaturation, renaturation and incubation in the presence of
polymerase of overlapping fragments to generate a collection of polynucleotides containing
different permutations of fragments is sometimes referred to as shuffling of the nucleic acid in
vitro. This cycle is repeated for a desired number of times. Preferably the cycle is repeated
from 2 to 100 times, more preferably the sequence is repeated from 10 to 40 times. The
resulting nucleic acids are a family of double-stranded polynucleotides of from about 50 bp to
about 100 kb, preferably from 500 bp to 50 kb, as shown in Fig. 1, panel D. The population
represents variants of the starting substrates showing substantial sequence identity thereto but
also diverging at several positions. The population has many more members than the starting
substrates. The population of fragments resulting from shuffling is used to transform host
cells, optionally after cloning into a vector.
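
The outcome of fragmentation and reassembly can be pictured with a toy in silico model. This is a deliberately simplified sketch and not the laboratory procedure: it assumes pre-aligned parent sequences of equal length, places fragment boundaries at random, and picks a random parent as the donor of each segment. It does not model annealing, polymerase extension or point mutation, and the function names and parameters are hypothetical.

```python
import random


def shuffle_once(parents, n_crossovers, rng):
    """Build one chimera from aligned, equal-length parent sequences."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this sketch assumes pre-aligned parents of equal length")
    # Fragment boundaries: distinct random cut points strictly inside the sequence.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)  # parent that contributes this segment
        pieces.append(donor[start:end])
    return "".join(pieces)


def shuffle_library(parents, size, n_crossovers=3, seed=0):
    """Return a list of chimeras; a fixed seed makes the sketch reproducible."""
    rng = random.Random(seed)
    return [shuffle_once(parents, n_crossovers, rng) for _ in range(size)]


if __name__ == "__main__":
    parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]  # toy substrates
    for chimera in shuffle_library(parents, size=5):
        print(chimera)
```

Even in this reduced model, a handful of parents yields a library with many more distinct members than the starting substrates, each identical to some parent at every position.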

In a variation of in vitro shuffling, subsequences of recombination substrates
can be generated by amplifying the full-length sequences under conditions which produce a
substantial fraction, typically at least 20 percent or more, of incompletely extended
amplification products. The amplification products, including the incompletely extended
amplification products, are denatured and subjected to at least one additional cycle of
reannealing and amplification. This variation, in which at least one cycle of reannealing and
amplification provides a substantial fraction of incompletely extended products, is termed
"stuttering." In the subsequent amplification round, the incompletely extended products
reanneal to and prime extension on different sequence-related template species.

In a further variation, a mixture of fragments is spiked with one or more
oligonucleotides. The oligonucleotides can be designed to include precharacterized mutations
of a wildtype sequence, or sites of natural variations between individuals or species. The
oligonucleotides also include sufficient sequence or structural homology flanking such
mutations or variations to allow annealing with the wildtype fragments. Some
oligonucleotides may be random sequences. Annealing temperatures can be adjusted
depending on the length of homology.

In a further variation, recombination occurs in at least one cycle by template
switching, such as when a DNA fragment derived from one template primes on the
homologous position of a related but different template. Template switching can be induced
by addition of recA, rad51, rad55, rad57 or other polymerases (e.g., viral polymerases, reverse
transcriptase) to the amplification mixture. Template switching can also be increased by
increasing the DNA template concentration.

In a further variation, at least one cycle of amplification can be conducted using
a collection of overlapping single-stranded DNA fragments of related sequence, and different
lengths. Fragments can be prepared using a single stranded DNA phage, such as M13. Each
fragment can hybridize to and prime polynucleotide chain extension of a second fragment from
the collection, thus forming sequence-recombined polynucleotides. In a further variation,
ssDNA fragments of variable length can be generated from a single primer by Vent or other
DNA polymerase on a first DNA template. The single stranded DNA fragments are used as
primers for a second, Kunkel-type template, consisting of a uracil-containing circular ssDNA.
This results in multiple substitutions of the first template into the second. See Levichkin et al.,
Mol. Biology 29, 572-577 (1995).

2. In Vivo Formats
(a) Plasmid-Plasmid Recombination

The initial substrates for recombination are a collection of polynucleotides
comprising variant forms of a gene. The variant forms often show substantial sequence
identity to each other sufficient to allow homologous recombination between substrates. The
diversity between the polynucleotides can be natural (e.g., allelic or species variants), induced
(e.g., error-prone PCR), or the result of in vitro recombination. Diversity can also result from
resynthesizing genes encoding natural proteins with alternative and/or mixed codon usage.
There should be at least sufficient diversity between substrates that recombination can
generate more diverse products than there are starting materials. There must be at least two
substrates differing in at least two positions. However, commonly a library of substrates of
lO^-lo' members is employed. The degree of diversity depends on the length of the substrate 
being recombined and the extent of the functional change to be evolved. Diversity at between
0.1-50% of positions is typical. The diverse substrates are incorporated into plasmids. The
plasmids are often standard cloning vectors, e.g., bacterial multicopy plasmids. However, in
some methods to be described below, the plasmids include mobilization functions. The
substrates can be incorporated into the same or different plasmids. Often at least two different
types of plasmid having different types of selection marker are used to allow selection for cells
containing at least two types of vector. Also, where different types of plasmid are employed,
the different plasmids can come from two distinct incompatibility groups to allow stable
co-existence of two different plasmids within the cell. Nevertheless, plasmids from the same
incompatibility group can still co-exist within the same cell for sufficient time to allow
homologous recombination to occur.
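
The requirement that substrates differ in at least two positions can be checked with a short enumeration. This is an illustrative sketch only: it assumes two aligned substrates of equal length and free recombination between every pair of differing positions, and the function name `possible_recombinants` is hypothetical.

```python
from itertools import product


def possible_recombinants(a, b):
    """Enumerate every sequence obtainable by mixing two aligned substrates.

    At each position where the substrates agree there is one choice; where
    they differ there are two, so n differing positions give 2**n sequences.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes aligned substrates of equal length")
    choices = [(x,) if x == y else (x, y) for x, y in zip(a, b)]
    return {"".join(combo) for combo in product(*choices)}


if __name__ == "__main__":
    one_difference = possible_recombinants("ACGT", "ATGT")
    two_differences = possible_recombinants("ACGT", "ATGA")
    print(len(one_difference), sorted(one_difference))
    print(len(two_differences), sorted(two_differences))
```

With a single differing position the only products are the two parents themselves, so recombination adds nothing; with two differing positions, two new sequences appear alongside the parents.
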

Plasmids containing diverse substrates are initially introduced into prokaryotic
or eukaryotic cells by any transfection methods (e.g., chemical transformation, natural
competence, electroporation, viral transduction or biolistics). Often, the plasmids are present
at or near saturating concentration (with respect to maximum transfection capacity) to
increase the probability of more than one plasmid entering the same cell. The plasmids
containing the various substrates can be transfected simultaneously or in multiple rounds. For
example, in the latter approach cells can be transfected with a first aliquot of plasmid,
transfectants selected and propagated, and then infected with a second aliquot of plasmid.

Having introduced the plasmids into cells, recombination between substrates to
generate recombinant genes occurs within cells containing multiple different plasmids merely
by propagating in the cells. However, cells that receive only one plasmid are unable to
participate in recombination and the potential contribution of substrates on such plasmids to
evolution is not fully exploited (although these plasmids may contribute to some extent if they
are propagated in mutator cells or otherwise accumulate point mutations (i.e., by ultraviolet
radiation treatment)). The rate of evolution can be increased by allowing all substrates to
participate in recombination. Such can be achieved by subjecting transfected cells to
electroporation. The conditions for electroporation are the same as those conventionally used
for introducing exogenous DNA into cells (e.g., 1,000-2,500 volts, 400 µF and a 1-2 mm
gap). Under these conditions, plasmids are exchanged between cells allowing all substrates to
participate in recombination. In addition the products of recombination can undergo further
rounds of recombination with each other or with the original substrate. The rate of evolution
can also be increased by use of conjugative transfer. Conjugative transfer systems are known
in many bacteria (E. coli, P. aeruginosa, S. pneumoniae, and H. influenzae) and can also be
used to transfer DNA between bacteria and yeast or between bacteria and mammalian cells.

To exploit conjugative transfer, substrates are cloned into plasmids having
MOB genes, and tra genes are also provided in cis or in trans to the MOB genes. The effect
of conjugative transfer is very similar to electroporation in that it allows plasmids to move
between cells and allows recombination between any substrate and the products of previous
recombination to occur merely by propagating the culture. The details of how conjugative
transfer is exploited in these vectors are discussed in more detail below. The rate of evolution
can also be increased by fusing protoplasts of cells to induce exchange of plasmids or
chromosomes. Fusion can be induced by chemical agents, such as PEG, or viruses or viral
proteins, such as influenza virus hemagglutinin, HSV-1 gB and gD. The rate of evolution can
also be increased by use of mutator host cells (e.g., Mut L, S, D, T, H and Ataxia
telangiectasia human cell lines).

Alternatively, plasmids can be propagated together to encourage recombination,
then isolated, pooled, and reintroduced into cells. The combination of plasmids is different in
each cell and recombination further increases the sequence diversity within the population. 
This is optionally carried out recursively until the desired level of diversity is achieved. The 
population is then screened and selected and this process optionally repeated with any selected 
cells/plasmids. 

The time for which cells are propagated and recombination is allowed to occur, 
of course, varies with the cell type but is generally not critical, because even a small degree of 
recombination can substantially increase diversity relative to the starting materials. Cells 
bearing plasmids containing recombined genes are subject to screening or selection for a 
desired function. For example, if the substrate being evolved contains a drug resistance gene,
one selects for drug resistance. Cells surviving screening or selection can be subjected to one
or more rounds of screening/selection followed by recombination or can be subjected directly
to an additional round of recombination.

The next round of recombination can be achieved by several different formats
independently of the previous round. For example, a further round of recombination can be
effected simply by resuming the electroporation or conjugation-mediated intercellular transfer
of plasmids described above. Alternatively, a fresh substrate or substrates, the same or
different from previous substrates, can be transfected into cells surviving selection/screening.
Optionally, the new substrates are included in plasmid vectors bearing a different selective
marker and/or from a different incompatibility group than the original plasmids. As a further
alternative, cells surviving selection/screening can be subdivided into two subpopulations, and
plasmid DNA from one subpopulation transfected into the other, where the substrates from
the plasmids from the two subpopulations undergo a further round of recombination. In either
of the latter two options, the rate of evolution can be increased by employing DNA extraction,
electroporation, conjugation or mutator cells, as described above. In a still further variation,
DNA from cells surviving screening/selection can be extracted and subjected to in vitro DNA
shuffling.

After the second round of recombination, a second round of screening/selection
is performed, preferably under conditions of increased stringency. If desired, further rounds of
recombination and selection/screening can be performed using the same strategy as for the
second round. With successive rounds of recombination and selection/screening, the surviving
recombined substrates evolve toward acquisition of a desired phenotype. Typically, in this and
other methods of recursive recombination, the final product of recombination that has acquired
the desired phenotype differs from starting substrates at 0.1%-25% of positions and has
evolved at a rate orders of magnitude in excess (e.g., by at least 10-fold, 100-fold, 1000-fold,
or 10,000 fold) of the rate of naturally acquired mutation of about 1 mutation per 10"^ 
positions per generation (see Anderson & Hughes, Proc. Nail Acad, Set USA 93, 906-907 
(1996)). As with other techniques herein, recombination steps can be performed recursively to 
enhance diversity prior to screening. In addition, the entire process can be performed in a 
recursive manner to generate desired organisms, clones or nucleic adds. 

3. Virus-Plasmid Recombination
The strategy used for plasmid-plasmid recombination can also be used for
virus-plasmid recombination; usually, phage-plasmid recombination. However, some
additional comments particular to the use of viruses are appropriate. The initial substrates for
recombination are cloned into both plasmid and viral vectors. It is usually not critical which
substrate(s) are inserted into the viral vector and which into the plasmid, although usually the
viral vector should contain different substrate(s) from the plasmid. As before, the plasmid
(and the virus) typically contains a selective marker. The plasmid and viral vectors can both be
introduced into cells by transfection as described above. However, a more efficient procedure
is to transform the cells with plasmid, select transformants and infect the transformants with a
virus. Because the efficiency of infection of many viruses approaches 100% of cells, most
cells transformed and infected by this route contain both a plasmid and virus bearing different
substrates.

Homologous recombination occurs between plasmid and virus, generating both
recombined plasmids and recombined virus. For some viruses, such as filamentous phage, in
which intracellular DNA exists in both double-stranded and single-stranded forms, both can
participate in recombination. Provided that the virus is not one that rapidly kills cells,
recombination can be augmented by use of electroporation or conjugation to transfer plasmids
between cells. Recombination can also be augmented for some types of virus by allowing the
progeny virus from one cell to reinfect other cells. For some types of virus, virus-infected
cells show resistance to superinfection. However, such resistance can be overcome by
infecting at high multiplicity and/or using mutant strains of the virus in which resistance to
superinfection is reduced. Recursive infection and transformation prior to screening can be
performed to enhance diversity.

The result of infecting plasmid-containing cells with virus depends on the
nature of the virus. Some viruses, such as filamentous phage, stably exist with a plasmid in the
cell and also extrude progeny phage from the cell. Other viruses, such as lambda having a
cosmid genome, stably exist in a cell like plasmids without producing progeny virions. Other
viruses, such as the T-phage and lytic lambda, undergo recombination with the plasmid but
ultimately kill the host cell and destroy plasmid DNA. For viruses that infect cells without
killing the host, cells containing recombinant plasmids and virus can be screened/selected using
the same approach as for plasmid-plasmid recombination. Progeny virus extruded by cells
surviving selection/screening can also be collected and used as substrates in subsequent rounds
of recombination. For viruses that kill their host cells, recombinant genes resulting from
recombination reside only in the progeny virus. If the screening or selective assay requires
expression of recombinant genes in a cell, the recombinant genes should be transferred from
the progeny virus to another vector, e.g., a plasmid vector, and retransfected into cells before
selection/screening is performed.

For filamentous phage, the products of recombination are present in both cells
surviving recombination and in phage extruded from these cells. The dual source of
recombinant products provides some additional options relative to the plasmid-plasmid
recombination. For example, DNA can be isolated from phage particles for use in a round of
in vitro recombination. Alternatively, the progeny phage can be used to transfect or infect
cells surviving a previous round of screening/selection, or fresh cells transfected with fresh
substrates for recombination.

4. Virus-Virus Recombination 
The principles described for plasmid-plasmid and plasmid-viral recombination
can be applied to virus-virus recombination with a few modifications. The initial substrates for
recombination are cloned into a viral vector. Usually, the same vector is used for all
substrates. Preferably, the virus is one that, naturally or as a result of mutation, does not kill
cells. After insertion, some viral genomes can be packaged in vitro. The packaged viruses are
used to infect cells at high multiplicity such that there is a high probability that a cell receives
multiple viruses bearing different substrates.

After the initial round of infection, subsequent steps depend on the nature of
infection as discussed in the previous section. For example, if the viruses have phagemid
genomes such as lambda cosmids or M13, F1 or Fd phagemids, the phagemids behave as
plasmids within the cell and undergo recombination simply by propagating in the cells.
Recombination and sequence diversity can be enhanced by electroporation of cells. Following
selection/screening, cosmids containing recombinant genes can be recovered from surviving
cells (e.g., by heat induction of a cos- lysogenic host cell), repackaged in vitro, and used to
infect fresh cells at high multiplicity for a further round of recombination.

If the viruses are filamentous phage, recombination of replicating form DNA
occurs by propagating the culture of infected cells. Selection/screening identifies colonies of
cells containing viral vectors having recombinant genes with improved properties, together
with phage extruded from such cells. Subsequent options are essentially the same as for
plasmid-viral recombination.

5. Chromosome-Plasmid Recombination
This format can be used to evolve both the chromosomal and plasmid-borne
substrates. The format is particularly useful in situations in which many chromosomal genes
contribute to a phenotype or one does not know the exact location of the chromosomal
gene(s) to be evolved. The initial substrates for recombination are cloned into a plasmid
vector. If the chromosomal gene(s) to be evolved are known, the substrates constitute a
family of sequences showing a high degree of sequence identity but some divergence from the
chromosomal gene. If the chromosomal genes to be evolved have not been located, the initial
substrates usually constitute a library of DNA segments of which only a small number show
sequence identity to the gene or gene(s) to be evolved. Divergence between plasmid-borne
substrate and the chromosomal gene(s) can be induced by mutagenesis or by obtaining the
plasmid-borne substrates from a different species than that of the cells bearing the
chromosome.

The plasmids bearing substrates for recombination are transfected into cells
having chromosomal gene(s) to be evolved. Evolution can occur simply by propagating the
culture, and can be accelerated by transferring plasmids between cells by conjugation,
electroporation or protoplast fusion. Evolution can be further accelerated by use of mutator
host cells or by seeding a culture of nonmutator host cells being evolved with mutator host
cells and inducing intercellular transfer of plasmids by electroporation, conjugation or
protoplast fusion. Alternatively, recursive isolation and transformation can be used.
Preferably, mutator host cells used for seeding contain a negative selection marker to facilitate
isolation of a pure culture of the nonmutator cells being evolved. Selection/screening
identifies cells bearing chromosomes and/or plasmids that have evolved toward acquisition of
a desired function.

Subsequent rounds of recombination and selection/screening proceed in similar
fashion to those described for plasmid-plasmid recombination. For example, further
recombination can be effected by propagating cells surviving recombination in combination
with electroporation, conjugative transfer of plasmids, or protoplast fusion. Alternatively,
plasmids bearing additional substrates for recombination can be introduced into the surviving
cells. Preferably, such plasmids are from a different incompatibility group and bear a different
selective marker than the original plasmids to allow selection for cells containing at least two
different plasmids. As a further alternative, plasmid and/or chromosomal DNA can be isolated
from a subpopulation of surviving cells and transfected into a second subpopulation.
Chromosomal DNA can be cloned into a plasmid vector before transfection.

6. Virus-Chromosome Recombination 
As in the other methods described above, the virus is usually one that does not
kill the cells, and is often a phage or phagemid. The procedure is substantially the same as for
plasmid-chromosome recombination. Substrates for recombination are cloned into the vector.
Vectors including the substrates can then be transfected into cells or in vitro packaged and
introduced into cells by infection. Viral genomes recombine with host chromosomes merely
by propagating a culture. Evolution can be accelerated by allowing intercellular transfer of
viral genomes by electroporation, or reinfection of cells by progeny virions.
Screening/selection identifies cells having chromosomes and/or viral genomes that have
evolved toward acquisition of a desired function.

There are several options for subsequent rounds of recombination. For
example, viral genomes can be transferred between cells surviving selection/recombination by
recursive isolation and transfection and electroporation. Alternatively, viruses extruded from
cells surviving selection/screening can be pooled and used to superinfect the cells at high
multiplicity. Alternatively, fresh substrates for recombination can be introduced into the cells,
either on plasmid or viral vectors.

CC. POOLWISE WHOLE GENOME RECOMBINATION
Asexual evolution is a slow and inefficient process. Populations move as
individuals rather than as a group. A diverse population is generated by mutagenesis of a
single parent, resulting in a distribution of fit and unfit individuals. In the absence of a sexual
cycle, each piece of genetic information for the surviving population remains in the individual
mutants. Selection of the fittest results in many fit individuals being discarded, along with
genetically useful information they carry. Asexual evolution proceeds one genetic event at a
time, and is thus limited by the intrinsic value of a single genetic event. Sexual evolution
moves more quickly and efficiently. Mating within a population consolidates genetic
information within the population and results in useful information being combined together.
The combining of useful genetic information results in progeny that are much more fit than
their parents. Sexual evolution thus proceeds much faster by multiple genetic events. These
differences are further illustrated in Fig. 17. In contrast to sexual evolution, DNA shuffling is
the recursive mutagenesis, recombination, and selection of DNA sequences (see also Fig.
25).

Sexual recombination in nature effects pairwise recombination and results in
progeny that are genetic hybrids of two parents. In contrast, DNA shuffling in vitro effects
poolwise recombination, in which progeny are hybrids of multiple parental molecules. This is
because DNA shuffling effects many individual pairwise recombination events with each
thermal cycle. After many cycles the result is a repetitively inbred population, with the
"progeny" being the Fx (for X cycles of reassembly) of the original parental molecules. These
progeny are potentially descendants of many or all of the original parents. The graph shown in
Fig. 25 shows a plot of the potential number of mutations an individual can accumulate by
sequential, pairwise and poolwise recombination.
Poolwise recombination is an important feature of DNA shuffling in that it
provides a means of generating a greater proportion of the possible combinations of mutations
from a single "breeding" experiment. In this way, the "genetic potential" of a population can
be readily assessed by screening the progeny of a single DNA shuffling experiment.

For example, if a population consists of 10 single mutant parents, there are 2^10 =
1024 possible combinations of those mutations, ranging from progeny having 0 to 10
mutations. Of these 1024, only 56 will result from a single pairwise cross (Fig. 14) (i.e., those
having 0, 1, and 2 mutations). In nature the multiparent combinations will eventually arise
after multiple random sexual matings, assuming no selection is imparted to remove some
mutations from the population. In this way, sex effects the consolidation and sampling of all
useful mutant combinations possible within a population. For the purposes of directed
evolution, having the greatest number of mutant combinations entering a screen or selection is
desirable so that the best progeny (i.e., according to the selection criteria used in the selection
screen) is identified in the shortest possible time.
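
The counts in the 10-parent example can be checked with a short sketch. The Python below is illustrative only; the function names are hypothetical and are not part of the disclosure. It assumes that each parent carries one distinct mutation and that a single pairwise cross can combine the mutations of at most two parents.

```python
from math import comb


def total_combinations(n_mutations: int) -> int:
    """Every mutation is either present or absent, giving 2**n combinations."""
    return 2 ** n_mutations


def combinations_up_to(n_mutations: int, max_per_progeny: int) -> int:
    """Count combinations carrying at most `max_per_progeny` of the mutations."""
    return sum(comb(n_mutations, k) for k in range(max_per_progeny + 1))


if __name__ == "__main__":
    n = 10  # ten single-mutant parents, as in the example above
    print(total_combinations(n))     # all possible combinations: 1024
    print(combinations_up_to(n, 2))  # 0, 1 or 2 mutations: 1 + 10 + 45 = 56
```

The difference between the two counts (968 of the 1024 combinations) is the part of the "genetic potential" that a single pairwise cross cannot reach.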

One challenge to in vivo and whole genome shuffling is devising methods for
effecting poolwise recombination or multiple repetitive pairwise recombination events. In
crosses with a single pairwise cross per cycle before screening, the ability to screen the
"genetic potential" of the starting population is limited. For this reason, the rate of in vivo and
whole genome shuffling mediated cellular evolution would be facilitated by effecting poolwise
recombination. Two strategies for poolwise recombination are described below (protoplast
fusion and transduction).

1. Protoplast Fusion:
Protoplast fusion (discussed supra) mediated whole genome shuffling (WGS) is
one format that can directly effect poolwise recombination. Whole genome shuffling is the
recursive recombination of whole genomes, in the form of one or more nucleic acid
molecule(s) (fragments, chromosomes, episomes, etc.), from a population of organisms,
resulting in the production of new organisms having distributed genetic information from at
least two of the starting population of organisms. The process of protoplast fusion is further
illustrated in Fig. 26.

Progeny resulting from the fusion of multiple parent protoplasts have been
observed (Hopwood & Wright, 1978); however, these progeny are rare (10^-4 to 10^-6). The low
frequency is attributed to the distribution of fusants arising from two, three, four, etc. parents
and the likelihood of the multiple recombination events (6 crossovers for a four parent cross)
that would have to occur for multiparent progeny to arise. Thus, it is useful to enrich for the
multiparent progeny. This can be accomplished, e.g., by repetitive fusion or enrichment for
multiply fused protoplasts. The process of poolwise fusion and recombination is further
illustrated in Fig. 27.

2. Repetitive Fusion: 

Protoplasts of identified parental cells are prepared, fused and regenerated.
Protoplasts of the regenerated progeny are then, without screening or enrichment, formed,
fused and regenerated. This can be carried out for two, three, or more cycles before screening
to increase the representation of multiparent progeny. The number of possible
mutations/progeny doubles for each cycle. For example, if one cross produces predominantly
progeny with 0, 1, and 2 mutations, a breeding of this population with itself will produce
progeny with 0, 1, 2, 3, and 4 mutations (Fig. 15), the third cross up to eight, etc. The
representation of the multiparent progeny from these subsequent crosses will not be as high as
the single and double parent progeny, but it will be detectable and much higher than from a
single cross. The repetitive fusion prior to screening is analogous to many sexual crosses
within a population, and the individual thermal cycles of in vitro DNA shuffling described
supra. A factor affecting the value of this approach is the starting size of the parental
population. As the population grows, it becomes more likely that a multiparent fusion will
arise from repetitive fusions. For example, if 4 parents are fused twice, the 4 parent progeny
will make up approximately 0.2% of the total progeny. This is sufficient to find in a
population of 3000 (95% confidence), but better representation is preferable. If ten parents
are fused twice, >20% of the progeny will be four parent offspring.
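
The doubling rule and the screening estimate above can be sketched as follows. This is a minimal illustration and not a method taken from the disclosure. The function names are hypothetical, parents are assumed to carry one mutation each, and the screening estimate assumes that progeny are sampled independently, each with the same probability of belonging to the multiparent class.

```python
from math import ceil, log


def max_mutations_per_progeny(cycles: int) -> int:
    """Upper bound on mutations per progeny after `cycles` rounds of fusion,
    assuming single-mutant parents (one cross -> 2, two -> 4, three -> 8)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return 2 ** cycles


def prob_at_least_one(fraction: float, screened: int) -> float:
    """Chance that a class making up `fraction` of the progeny appears at
    least once among `screened` independently sampled progeny."""
    return 1.0 - (1.0 - fraction) ** screened


def min_screen_size(fraction: float, confidence: float = 0.95) -> int:
    """Smallest number of progeny to screen so that the class is seen at
    least once with the requested confidence (independence assumed)."""
    return ceil(log(1.0 - confidence) / log(1.0 - fraction))


if __name__ == "__main__":
    for cycle in (1, 2, 3):
        print(cycle, max_mutations_per_progeny(cycle))
    print(prob_at_least_one(0.002, 3000))  # four-parent class at 0.2%
    print(min_screen_size(0.002))
```

Under these assumptions, a class present at 0.2% is expected to appear at least once among 3000 screened progeny with a probability above 95%, which is consistent with the figure quoted above.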

3. Enrichment for multiply fused protoplasts:
After the fusion of a population of protoplasts, the fusants are typically diluted
into hypotonic medium, to dilute out the fusing agent (e.g., 50% PEG). The fused cells can be
grown for a short period to regenerate cell walls, or processed directly, and are then separated
on the basis of size. This is carried out, e.g., by cell sorting, using light dispersion as an
estimate of size, to isolate the largest fusants. Alternatively, the fusants can be sorted by FACS
on the basis of DNA content. The large fusants or those containing more DNA result from the
fusion of multiple parents and are more likely to segregate to multiparent progeny. The
enriched fusants are regenerated and screened directly or the progeny are fused recursively as
above to further enrich the population for diverse mutant combinations.

4. Transduction: 

Transduction can theoretically effect poolwise recombination, if the transducing
phage particles contain predominantly host genomic DNA rather than phage DNA. If phage
DNA is overly represented, then most cells will receive at least one undesired phage genome.
Phage particles generated from locked-in prophage (supra) are useful for this purpose. A
population of cells is infected with an appropriate transducing phage, and the lysate is
collected and used to infect the same starting population. A high multiplicity of infection is
employed to deliver multiple genomic fragments to each infected cell, thereby increasing the
chance of producing recombinants containing mutations from more than two parent genomes.
The resulting transductants are recovered under conditions where phage cannot propagate,
e.g., in the presence of citrate. This population is then screened directly or infected again with
phage, with the resulting transducing particles being used to transduce the first progeny. This
would mimic recursive protoplast fusion, multiple sexual recombination, and in vitro DNA
shuffling.
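
The effect of the multiplicity of infection on the number of genomic fragments delivered per cell can be illustrated with a Poisson model. The Poisson model is a common simplifying assumption for infection at a given multiplicity and is not stated in the text above. The sketch below is hypothetical: the function names are illustrative, every particle is assumed to be a transducing particle that delivers one genomic fragment, and particles are assumed to adsorb to cells independently.

```python
from math import exp, factorial


def prob_k_fragments(k: int, moi: float) -> float:
    """Poisson probability that a cell receives exactly k fragments when the
    average number of transducing particles per cell is `moi`."""
    return exp(-moi) * moi ** k / factorial(k)


def prob_at_least(k: int, moi: float) -> float:
    """Probability that a cell receives k or more fragments."""
    return 1.0 - sum(prob_k_fragments(i, moi) for i in range(k))


if __name__ == "__main__":
    # Fraction of cells receiving three or more fragments, i.e. cells that
    # could combine material from more than two parent genomes.
    for moi in (1.0, 5.0, 10.0):
        print(moi, round(prob_at_least(3, moi), 4))
```

In this model the fraction of cells receiving three or more fragments rises steeply with multiplicity, which is the qualitative point made above. Whether the fragments in one cell actually come from different parents depends on the composition of the lysate, which the model does not address.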

DD. METHODS FOR WHOLE GENOME SHUFFLING BY BLIND
FAMILY SHUFFLING OF PARSED GENOMES AND RECURSIVE
CYCLES OF FORCED INTEGRATION AND EXCISION BY
HOMOLOGOUS RECOMBINATION, AND SCREENING FOR
IMPROVED PHENOTYPES.

In vitro methods have been developed to shuffle single genes and operons, as
set forth, e.g., herein. "Family" shuffling of homologous genes within species and from
different species is also an effective method for accelerating molecular evolution. This
section describes additional methods for extending these methods such that they can be
applied to whole genomes.

In some cases, the genes that encode rate limiting steps in a biochemical
process, or that contribute to a phenotype of interest, are known. This method can be used to
target family shuffled libraries to such loci, generating libraries of organisms with high quality
family shuffled libraries of alleles at the locus of interest. An example of such an application
would be the evolution of a host chaperonin to more efficiently chaperone the folding of an
overexpressed protein in E. coli.

The goals of this process are to shuffle homologous genes from two or more
species and to then integrate the shuffled genes into the chromosome of a target organism.
Integration of multiple shuffled genes at multiple loci can be achieved using recursive cycles of
integration (generating duplications), excision (leaving the improved allele in the chromosome)
and transfer of additional evolved genes by serially applying the same procedure.

In the first step, genes to be shuffled are subcloned into suitable bacterial
vectors. These vectors can be plasmids, cosmids, BACs or the like. Thus, fragments from
100 bp to 100 kb can be handled. Homologous fragments are then "family shuffled" together
(i.e., homologous fragments from different species or chromosomal locations are
homologously recombined). As a simple case, homologs from two species (say, E. coli and
Salmonella) are cloned, family shuffled in vitro and cloned into an allele replacement vector
(e.g., a vector with a positively selectable marker, a negatively selectable marker and a
conditionally active origin of replication). The basic strategy for whole genome family
shuffling of parsed (subcloned) genomes is additionally set forth in Fig. 22.

The vectors are transfected into E. coli and selected, e.g., for drug resistance.
Most drug resistant cells should arise by homologous recombination between a family shuffled
insert and a chromosomal copy of the cloned insert. Colonies with improved phenotype are
screened (e.g., by mass spectroscopy for enzyme activity or small molecule production, or a
chromogenic screen, or the like, depending on the phenotype to be assayed). Negative
selection (i.e., suc selection) is imposed to force excision of tandem duplication. Roughly half
of the colonies should retain the improved phenotype. Importantly, this process regenerates a
"clean" chromosome in which the wild type locus is replaced with a family shuffled fragment
that encodes a beneficial allele. Since the chromosome is "clean" (i.e., has no vector
sequences), other improved alleles can also be moved into this point on the chromosome by
homologous recombination.

Selection or screening for improved phenotype can occur either after step 3 or
step 4 in Figure 22. If selection or screening takes place after step 3, then the improved allele
can be conveniently moved to other strains by, for example, P1 transduction. One can then
regenerate a strain containing the improved allele but lacking vector sequences by "negative
selection" against the suc marker. In subsequent rounds, independently identified improved
variants of the gene can be sequentially moved into the improved strain (e.g., by P1
transduction of the drug marked tandem duplication above). Transductants are screened for
further improvement in phenotype by virtue of receiving the transduced tandem duplication,
which itself contains the family shuffled genetic material. Negative selection is again imposed
and the process of shuffling the improved strain is recursively repeated as desired.

Although this process was described with reference to targeting a gene or
genes of interest, it can be used "blindly," making no assumptions about which locus is to be
targeted. This procedure is set forth in Fig. 23. For example, the whole genome of an
organism of interest is cloned into manageable fragments (e.g., 10 kb for plasmid-based
methods). Homologous fragments are then isolated from related species by the method shown
in Figure 23. Forced recombination with chromosomal homologs creates chimeras (Fig. 22).

EE. METHODS FOR HIGH THROUGHPUT FAMILY SHUFFLING OF 
GENES 

For E. coli, cloning the genome in 10 kb fragments requires about 300 clones.
The homologous fragments are isolated, e.g., from Salmonella. This gives roughly three
hundred pairs of homologous fragments. Each pair is family shuffled and the shuffled
fragments are cloned into an allele replacement vector. The inserts are integrated into the E.
coli genome as described above. A global screen is made to identify variants with an
improved phenotype. This serves as the basic collection of improvements that are to be
shuffled to produce a desired strain. The shuffling of these independently identified variants
into one super strain is done as described above.

Family shuffling has been shown to be an efficient method for creating high
quality libraries of genetic variants. Given a cloned gene from one species, it is of interest to
rapidly isolate homologs from other species, and this process can be rate limiting.
For example, if one wants to perform family shuffling on an entire genome, one may need to
construct hundreds to thousands of individual family shuffled libraries.

In this embodiment, a gene of interest is optionally cloned into a vector in
which ssDNA can be made. An example of such a vector is a phagemid vector with an M13
origin of replication. Genomic DNA or cDNA from a species of interest is isolated,
denatured, annealed to the phagemid, and then enzymatically manipulated to clone it. The
cloned DNA is then used to family shuffle with the original gene of interest. PCR based
formats are also available as outlined in Figure 24. These formats require no intermediate
cloning steps, and are, therefore, of particular interest for high throughput applications.

Alternatively, the gene of interest can be fished out using purified RecA
protein. The gene of interest is PCR amplified using primers that are tagged with an affinity
tag such as biotin, denatured, then coated with RecA protein (or an improved variant thereof).
The coated ssDNA is then mixed with a gDNA plasmid library. Under the appropriate
conditions, such as in the presence of non-hydrolyzable rATP analogs, RecA will catalyze the
hybridization of the RecA coated gene (ssDNA) to homologous DNA in the plasmid library. The
heteroduplex is then affinity purified from the non-hybridizing plasmids of the gene library by
adsorption of the labeled PCR products and their associated homologous DNA to an appropriate
affinity matrix. The homologous DNA is used in a family shuffling reaction for improvement of
the desired function.

Shuffling the E. coli chaperonin gene DnaJ with other homologs is described
below as an example. The example can be generalized to any other gene, including eukaryotic
genes such as plant or animal genes (including mammalian genes), by following the format
described. Fig. 24 provides a schematic outline of the steps to high throughput family
shuffling.

As a first step, the E. coli DnaJ gene is cloned into an M13 phagemid vector.
ssDNA is then produced, preferably in a dut(-) ung(-) strain so that Kunkel site directed
mutagenesis protocols can be applied. Genomic DNA is then isolated from a non-E. coli
source, such as Salmonella and Yersinia pestis. The bacterial genomic DNAs are denatured
and reannealed to the phagemid ssDNA (e.g., about 1 microgram of ssDNA). The reannealed
product is treated with an enzyme such as Mung Bean nuclease that degrades ssDNA as an
exonuclease but not as an endonuclease (the nuclease does not degrade mismatched DNA that
is embedded in a larger annealed fragment). The standard Kunkel site directed mutagenesis
protocol is used to extend the fragment and the target cells are transformed with the resulting
mutagenized DNA.

In a first variation on the above, the procedure is adapted to the situation where
the target gene or genes of interest are unknown. In this variation, the whole genome of the
organism of interest is cloned in fragments (e.g., of about 10 kb each) into a phagemid. Single
stranded phagemid DNA is then produced. Genomic DNA from the related species is
denatured and annealed to the phagemids. Mung bean nuclease is used to trim away
unhybridized DNA ends. Polymerase plus ligase is used to fill in the resulting gapped circles.
These clones are transformed into a mismatch repair deficient strain. When the mismatched
molecules are replicated in the bacteria, most colonies contain both the E. coli and the
homologous fragment. The two homologous genes are then isolated from the colonies (e.g.,
either by standard plasmid purification or colony PCR) and shuffled.

Another approach to generating chimeras that requires no in vitro shuffling is
simply to clone the Salmonella genome into an allele replacement vector, transform E. coli,
and select for chromosomal integrants. Homologous recombination between Salmonella
genes and E. coli homologs generates shuffled chimeras. A global screen is done to screen for
improved phenotypes. Alternatively, recursive transformation and recombination is performed
to increase diversity prior to screening. If colonies with improved phenotypes are obtained, it
is verified that the improvement is due to allele replacement by P1 transduction into a fresh
strain and counterscreening for improved phenotype. A collection of such improved alleles can
then be combined into one strain using the methods for whole genome shuffling by blind family
shuffling of parsed genomes as set forth herein. Additionally, once these loci are identified, it
is likely that further rounds of shuffling and screening will yield further improvements. This
could be done by cloning the chimeric gene and then using the methods described in this
disclosure to breed the gene with homologs from many different strains of bacteria.

In general, the transformants contain clones of the homologue of the target
gene (e.g., E. coli DnaJ in the example above). Mismatch repair in vivo results in a decrease
in diversity of the gene. There are at least two solutions to this. First, transduction can be
performed into a mismatch repair deficient strain. Alternatively or in addition, the M13
template DNA can be selectively degraded, leaving the cloned homologue. This can be done
using methods similar to the standard Eckstein site directed mutagenesis technique (general
texts which describe general molecular biological techniques useful herein, including
mutagenesis, include Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.),
Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989
("Sambrook") and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons,
Inc., (supplemented through 1998) ("Ausubel")).

This method relies on incorporation of alpha thiol modified dNTPs during
synthesis of the new strand followed by selective degradation of the template and resynthesis
of the template strand. In one embodiment, the template strand is grown in a dut(-) ung(-)
strain so that uracil is incorporated into the phagemid DNA. After extension as noted above
(and before transformation) the DNA is treated with uracil glycosylase and an apurinic site
endonuclease such as Endo III or Endo IV. The treated DNA is then treated with a
processive exonuclease that resects from the resulting gaps while leaving the other strand
intact (as in Eckstein mutagenesis). The DNA is polymerized and ligated. Target cells are
then transformed. This process enriches for clones encoding the homologue which is not
derived from the target (i.e., in the example above, the non-E. coli homologue).

An analogous procedure is optionally performed in a PCR format. As applied
to the DnaJ illustration above, DnaJ DNA is amplified by PCR with primers that build 30-mer
priming sites on each end. The PCR product is denatured and annealed with an excess of
Salmonella genomic DNA. The Salmonella DnaJ gene hybridizes with the E. coli homologue. After
treatment with Mung Bean nuclease, the resulting mismatched hybrid is PCR amplified with
the flanking 30-mer primers. This PCR product can be used directly for family shuffling. See,
e.g., Fig. 24.

As genomics provides an increasing amount of sequence information, it is
increasingly possible to directly PCR amplify homologs with designed primers. For example,
given the sequence of the E. coli genome and of a related genome (i.e., Salmonella), each
genome can be PCR amplified with designed primers in, e.g., 5 kb fragments. The
homologous fragments can be put together in a pairwise fashion for shuffling. For genome
shuffling, the shuffled products are cloned into the allele replacement vector and bred into the
genome as described supra.
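
The tiling of two genomes into matched fragments can be sketched as follows. This is only a minimal illustration under a strong assumption: that the homologous regions have already been aligned and are colinear, so that fragment k of one genome corresponds to fragment k of the other. The function names and the 12 kb toy region are hypothetical and are not taken from this disclosure.

```python
def tile(length, fragment_size=5000):
    """Return (start, end) coordinates of consecutive fragments covering a sequence."""
    return [(start, min(start + fragment_size, length))
            for start in range(0, length, fragment_size)]


def pair_fragments(tiles_a, tiles_b):
    """Pair fragments by index; assumes colinear, pre-aligned homologous regions."""
    return list(zip(tiles_a, tiles_b))


# Hypothetical aligned region of 12 kb in each genome, cut into 5 kb fragments.
pairs = pair_fragments(tile(12_000), tile(12_000))
```

Each pair would then correspond to one pairwise shuffling reaction; real genomes contain rearrangements, so in practice the pairing would be driven by sequence alignment rather than by index.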

FF. HYPER-RECOMBINOGENIC RECA CLONES

The invention further provides hyper-recombinogenic RecA proteins (see, the
examples below). Examples of such proteins are from clones 2, 4, 5, 6 and 13 shown in Fig.
13. It is fully expected that one of skill can make a variety of related recombinogenic proteins
given the disclosed sequences.

Clones comprising the sequences in Figs. 12 and 13 are optionally used as the
starting point for any of the shuffling methods herein, providing a starting point for mutation
and recombination to improve the clones which are shown.

Standard molecular biological techniques can be used to make nucleic acids
which comprise the given nucleic acids, e.g., by cloning the nucleic acids into any known
vector. Examples of appropriate cloning and sequencing techniques, and instructions
sufficient to direct persons of skill through many cloning exercises are found in Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152
Academic Press, Inc., San Diego, CA (Berger); Sambrook et al. (1989) Molecular Cloning -
A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor Press, NY, (Sambrook); and Current Protocols in Molecular Biology, F.M. Ausubel
et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and
John Wiley & Sons, Inc., (1994 Supplement) (Ausubel). Product information from
manufacturers of biological reagents and experimental equipment also provides information
useful in known biological methods. Such manufacturers include the SIGMA chemical
company (Saint Louis, MO), R&D Systems (Minneapolis, MN), Pharmacia LKB
Biotechnology (Piscataway, NJ), CLONTECH Laboratories, Inc. (Palo Alto, CA), Chem
Genes Corp., Aldrich Chemical Company (Milwaukee, WI), Glen Research, Inc., GIBCO
BRL Life Technologies, Inc. (Gaithersburg, MD), Fluka Chemica-Biochemika Analytika
(Fluka Chemie AG, Buchs, Switzerland), Invitrogen, San Diego, CA, and Applied Biosystems
(Foster City, CA), as well as many other commercial sources known to one of skill.

It will be appreciated that conservative substitutions of the given sequences can
be used to produce nucleic acids which encode hyperrecombinogenic clones. "Conservatively
modified variations" of a particular nucleic acid sequence refers to those nucleic acids which
encode identical or essentially identical amino acid sequences, or where the nucleic acid does
not encode an amino acid sequence, to essentially identical sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
any given polypeptide. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all
encode the amino acid arginine. Thus, at every position where an arginine is specified by a
codon, the codon can be altered to any of the corresponding codons described without altering
the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one
species of "conservatively modified variations." Every nucleic acid sequence herein which
encodes a polypeptide also describes every possible silent variation. One of skill will
recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon
for methionine) can be modified to yield a functionally identical molecule by standard
techniques. Accordingly, each "silent variation" of a nucleic acid which encodes a polypeptide
is implicit in any described sequence. Furthermore, one of skill will recognize that individual
substitutions, deletions or additions which alter, add or delete a single amino acid or a small
percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded
sequence are "conservatively modified variations" where the alterations result in the
substitution of an amino acid with a chemically similar amino acid. Conservative substitution
tables providing functionally similar amino acids are well known in the art. The following six
groups each contain amino acids that are conservative substitutions for one another: 1)
Alanine (A), Serine (S), Threonine (T); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine
(M), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). See also,
Creighton (1984) Proteins, W.H. Freeman and Company. Finally, the addition of sequences
which do not alter the encoded activity of a nucleic acid molecule, such as a non-functional
sequence, is a conservative modification of the basic nucleic acid.
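
The six substitution groups and the arginine codon example can be expressed as a short lookup. The following is a minimal sketch for illustration only; the function names are hypothetical, and the check covers only the groups and codons named in the text, not a complete genetic code table.

```python
# The six conservative substitution groups listed in the text (one-letter codes).
GROUPS = [set("AST"), set("DE"), set("NQ"), set("RK"), set("ILMV"), set("FYW")]

# The six arginine codons named in the text.
ARGININE_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}


def is_conservative(aa1, aa2):
    """True if two different residues fall in the same substitution group."""
    return aa1 != aa2 and any(aa1 in g and aa2 in g for g in GROUPS)


def is_silent_arginine_change(codon1, codon2):
    """True if both codons encode arginine, so swapping them leaves the protein unchanged."""
    return codon1 in ARGININE_CODONS and codon2 in ARGININE_CODONS
```

For example, `is_conservative("D", "E")` holds because aspartic acid and glutamic acid share group 2, whereas a change from aspartic acid to lysine crosses groups and is not conservative under this table.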

One of skill will appreciate that many conservative variations of the nucleic acid
constructs disclosed yield a functionally identical construct. For example, due to the
degeneracy of the genetic code, "silent substitutions" (i.e., substitutions of a nucleic acid
sequence which do not result in an alteration in an encoded polypeptide) are an implied feature
of every nucleic acid sequence which encodes an amino acid. Similarly, "conservative amino
acid substitutions," in which one or a few amino acids in an amino acid sequence of a packaging or
packageable construct are substituted with different amino acids with highly similar properties,
are also readily identified as being highly similar to a disclosed construct. Such conservatively
substituted variations of each explicitly disclosed sequence are a feature of the present
invention.

Nucleic acids which hybridize under stringent conditions to the nucleic acids in
the figures are a feature of the invention. "Stringent hybridization wash conditions" in the
context of nucleic acid hybridization experiments such as Southern and northern hybridizations
are sequence dependent, and are different under different environmental parameters. An
extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory
Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes
part I chapter 2 "overview of principles of hybridization and the strategy of nucleic acid probe
assays", Elsevier, New York. Generally, highly stringent hybridization and wash conditions
are selected to be about 5° C lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
Very stringent conditions are selected to be equal to the Tm for a particular probe. In general,
a signal to noise ratio of 2x (or higher) than that observed for an unrelated probe in the
particular hybridization assay indicates detection of a specific hybridization.

Nucleic acids which do not hybridize to each other under stringent conditions 
are still substantially identical if the polypeptides which they encode are substantially identical. 
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code.

Finally, preferred nucleic acids encode hyper-recombinogenic RecA proteins
which are at least one order of magnitude (10 times) as active as a wild-type RecA protein in a
standard assay for RecA activity.

GG. recE/recT MEDIATED SHUFFLING IN VIVO

Like recA, recE and recT (or their homologues, for example the lambda
recombination proteins redα and redβ) can stimulate homologous recombination in vivo. See,
Muyrers et al. (1999) Nucleic Acids Res 27(6):1555-7 and Zhang et al. (1998) Nat Genet
20(2):123-8.

Hyper-recombinogenic recE and recT are evolved by the same method as
described for recA. Alternatively, variants with increased recombinogenicity are selected by
their ability to cause recombination between a suicide vector (lacking an origin of replication)
carrying a selectable marker, and a homologous region in either the chromosome or a stably-
maintained episome.

A plasmid containing recE and recT genes is shuffled (either using these genes
as single starting points, or by family shuffling (with for example redα and redβ, or other
homologous genes identified from available sequence databases)). This shuffled library is then
cloned into a vector with a selectable marker and transformed into an appropriate
recombination-deficient strain. The library of cells would then be transformed with a second
selectable marker, either borne on a suicide vector or as a linear DNA fragment with regions at
its ends that are homologous to a target sequence (either in the plasmid or in the host
chromosome). Integration of this marker by homologous recombination is a selectable event,
dependent on the activity of the recE and recT gene products. The recE/recT genes are
isolated from cells in which homologous recombination has occurred. The process is repeated
several times to enrich for the most efficient variants before the next round of shuffling is
performed. In addition, cycles of recombination without selection can be performed to
increase the diversity of a cell population prior to selection.

Once hyper-recombinogenic recE/recT genes are isolated they are used as
described for hyper-recombinogenic recA. For example they are expressed (constitutively or
conditionally) in a host cell to facilitate homologous recombination between variant gene
fragments and homologues within the host cell. They are alternatively introduced by
microinjection, biolistics, lipofection or other means into a host cell at the same time as the
variant genes.

Hyper-recombinogenic recE/recT (either of bacterial/phage origin, or from
plant homologues) are useful for facilitating homologous recombination in plants. They are,
for example, cloned into the Agrobacterium cloning vector, where they are expressed upon
entry into the plant, thereby stimulating homologous recombination in the recipient cell.

In a preferred embodiment, recE/recT are used and/or generated in mutS
strains.

HH. MULTI-CYCLIC RECOMBINATION

As noted, protoplast fusion is an efficient means of recombining two microbial
genomes. The process reproducibly results in about 10% of a non-selected population being
recombinant chimeric organisms.

Protoplasts are cells that have been stripped of their cell walls by treatment in
hypotonic medium with cell wall degrading enzymes. Protoplast fusion is the induced fusion
of the membranes of two or more of these protoplasts by fusogenic agents such as
polyethylene glycol. Fusion results in cytoplasmic mixing and places the genomes of the fused
cells within the same membrane. Under these conditions recombination between the genomes
is frequent.

The fused protoplasts are regenerated, and, during cell division, single genomes
segregate into each daughter cell. Typically, 10% of these daughter cells have genomes that
originate partially from more than one of the original parental protoplast genomes.

This result is similar to that of the crossing over of sister chromatids in
eukaryotic cells during prophase of meiosis II. The percentage of daughter cells that are
recombinant is just lower after protoplast fusion. While protoplast fusion does result in
efficient recombination, the recombination predominantly occurs between two cells as in
sexual recombination.

In order to efficiently generate whole genome shuffled libraries,
daughter cells having genetic information originating from multiple parents are made.

In vitro DNA shuffling results in the efficient poolwise recombination of
multiple homologous DNA sequences. The reassembly of full length genes from a mixed pool
of small gene fragments requires multiple annealing and elongation cycles, the thermal cycles
of the primerless PCR reaction. During each thermal cycle, many pairs of fragments anneal
and are extended to form a combinatorial population of larger chimeric DNA fragments. After
the first cycle of reassembly, chimeric fragments contain sequences originating from two
different parent genes. This is similar to the result of a single sexual cycle within a population,
pairwise cross, or protoplast fusion. During the second cycle, these chimeric fragments can
anneal with each other, or with other small fragments, resulting in chimeras originating from
up to four different parental sequences.

This second cycle is analogous to the entire progeny from a single sexual cross
inbreeding with itself. Further cycles will result in chimeras originating from 8, 16, 32, etc.
parental sequences and are analogous to further inbreedings of the progeny population. The
power of in vitro DNA shuffling is that a large combinatorial library can be generated from a
single pool of DNA fragments reassembled by these recursive pairwise "matings." As
described above, in vivo shuffling strategies, such as protoplast fusion, result in a single
pairwise mating reaction. Thus, to generate the level of diversity obtained by in vitro
methods, in vivo methods are carried out recursively. That is, a pool of organisms is
recombined and the progeny pooled, without selection, and then recombined again. This
process is repeated for sufficient cycles to result in progeny having multiple parental
sequences.
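
The doubling argument above can be illustrated with a toy simulation. This is a sketch under simplifying assumptions that are not part of this disclosure: cells fuse strictly in pairs, a fixed fraction of fusions (here 10%) yields recombinant progeny, and both daughters of a recombinant fusion are treated as carrying ancestry from both partners. The function name and all parameter values are hypothetical.

```python
import random


def recursive_pooled_fusion(n_parents=4, pop_size=1000, rounds=4,
                            recomb_rate=0.10, seed=0):
    """Track which parental genomes contribute to each cell over recursive rounds."""
    rng = random.Random(seed)
    # Each cell is represented by the set of parent ids found in its ancestry.
    population = [frozenset([i % n_parents]) for i in range(pop_size)]
    max_parents_per_round = []
    for _ in range(rounds):
        rng.shuffle(population)  # pool the progeny without selection
        next_gen = []
        for a, b in zip(population[0::2], population[1::2]):
            if rng.random() < recomb_rate:
                child = a | b  # recombinant: ancestry from both fusion partners
                next_gen += [child, child]
            else:
                next_gen += [a, b]  # fusion without recombination
        if len(population) % 2:
            next_gen.append(population[-1])  # unpaired cell carries over
        population = next_gen
        max_parents_per_round.append(max(len(cell) for cell in population))
    return population, max_parents_per_round
```

In this model a cell after cycle n can descend from at most 2^n parents, so a single fusion can never yield a four-parent recombinant, while two or more recursive cycles can.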

Described below is a method used to shuffle four strains of Streptomyces
coelicolor. From the initial four strains each containing a unique nutritional marker, three to
four rounds of recursive pooled protoplast fusion was sufficient to generate a population of
shuffled organisms containing all 16 possible combinations of the four markers. This
represents a 10^ fold improvement in the generation of four parent progeny as compared to a
single pooled fusion of the four strains.

As set forth in Figure 31, protoplasts were generated from several strains of S.
coelicolor, pooled and fused. Mycelia were regenerated and allowed to sporulate. The spores
were collected, allowed to grow into mycelia, formed into protoplasts, pooled and fused, and
the process repeated for three to four rounds; the resulting spores were then subject to
screening.

The basic protocol for generating a whole genome shuffled library from four S.
coelicolor strains, each having one of four distinct markers, was as follows. Four mycelial
cultures, each of a strain having one of four different markers, were grown to early stationary
phase. The mycelia from each were harvested by centrifugation and washed. Protoplasts from
each culture were prepared as follows.

Approximately 10' S. coelicolor spores were inoculated into 50 ml YEME with
0.5% glycine in a 250 ml baffled flask. The spores were incubated at 30°C for 36-40 hours in
an orbital shaker. Mycelia were verified using a microscope. Some strains needed an
additional day of growth. The culture was transferred into a 50 ml tube and centrifuged at
4,000 rpm for 10 min. The mycelia were twice washed with 10.3% sucrose and centrifuged
at 4,000 rpm for 10 min (mycelia can be stored at -80°C after the wash). 5 ml of lysozyme was
added to the ~0.5 g mycelium pellet. The pellet was suspended and incubated at 30°C for
20-60 min, with gentle shaking every 10 min. The suspension was checked under the microscope
for protoplasting every 20 min. Once the majority were protoplasts, protoplasting was stopped
by adding 10 ml of P buffer. The protoplasts were filtered through cotton and spun down at
3,000 rpm for 7 min at room temperature. The supernatant was discarded and the protoplasts
gently resuspended, adding a suitable amount of P buffer according to the pellet size (usually
about 500 µl). Ten-fold serial dilutions were made in P buffer, and the protoplasts counted at a
10'^ dilution. Protoplasts were adjusted to 10^^ protoplasts per ml.

The protoplasts from each culture were quantitated by microscopy. 10^
protoplasts from each culture were mixed in the same tube, washed, and then fused by the
addition of 50% PEG. The fused protoplasts were diluted and plated on regeneration medium
and incubated until the colonies were sporulating (four days). Spores were harvested and
washed. These spores represent a pool of all the recombinants and parents from the fusion.
A sample of the pooled spores was then used to inoculate a single liquid culture. The culture
was grown to early stationary phase, the mycelia harvested, and protoplasts prepared. 10^
protoplasts from this "mycelial library" were then fused with themselves by the addition of
50% PEG. The protoplast fusion/regeneration/harvesting/protoplast preparation steps were
repeated two times. The spores resulting from the fourth round of fusion were considered the
"whole genome shuffled library" and they were screened for the frequency of the 16 possible
combinations of the four markers. The results from each round of fusion are shown in Figure 33
and in the following table.

The results of the shuffling procedure are set forth in Figure 33. In particular,
adding rounds of recombination prior to selection produced significant increases in the number
of clones which incorporated all four of the relevant selectable markers, indicating that the
population became increasingly diverse by recursive pooling and sporulation. Additional
results are set forth in the following table.

The four strains of the four parent shuffling were each auxotrophic for three
and prototrophic for one of four possible nutritional markers: arginine (A), cystine (C), proline
(P), and/or uracil (U). Spores from each fusion were plated in each of the 16 possible
combinations of these four nutrients, and the percent of the population growing on a
particular medium was calculated as the ratio of those colonies from a selective plate to
those growing on a plate having all four nutrients (all variants grow on the medium having all
four nutrients, thus the colonies from this plate represent the total viable population). The
corrected percentages for each of the no, one, two, and three marker phenotypes were
determined by subtracting the percentage of cells having additional markers that might grow
on the medium having "unnecessary" nutrients. For example, the number of colonies growing
on no additional nutrients (the prototroph) was subtracted from the number of colonies
growing on any plate requiring nutrients.
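
The correction described above can be sketched as a subtraction over nutrient subsets. This is a minimal illustration, assuming that a plate supplemented with a given set of nutrients supports every genotype whose requirements are contained in that set; the function names and the use of raw colony counts instead of percentages are choices made for the example, not details from this disclosure.

```python
from itertools import combinations

NUTRIENTS = "ACPU"  # arginine, cystine, proline, uracil


def subsets(items):
    """Yield all subsets of items as frozensets, smallest first."""
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def corrected_counts(raw):
    """raw maps each nutrient set (frozenset) to the colonies growing on that plate.

    Returns, for each nutrient set, the colonies that require exactly that set.
    """
    corrected = {}
    for plate in subsets(NUTRIENTS):
        # Genotypes needing only a proper subset of the supplied nutrients also
        # grow on this plate; subtract their already-corrected counts.
        carried = sum(count for needs, count in corrected.items() if needs < plate)
        corrected[plate] = raw[plate] - carried
    return corrected
```

Because the subsets are visited from smallest to largest, the prototroph count (no nutrients) is fixed first and is then removed from every richer plate, as in the example given in the text.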

II. WHOLE GENOME SHUFFLING THROUGH ORGANIZED
HETERODUPLEX SHUFFLING

A new procedure to optimize phenotypes of interest by heteroduplex shuffling
of cosmid libraries of the organism of choice is provided. This procedure does not require
protoplast fusion and is applicable to bacteria for which well-established genetic systems are
available, including cosmid cloning, transformation, in vitro packaging/transfection and
plasmid transfer/mobilization. Microorganisms that can be improved by these methods include
Escherichia coli, Pseudomonas aeruginosa, Pseudomonas putida, Pseudomonas spp.,
Rhizobium spp., Xanthomonas spp., and other gram-negative organisms. This method is also
applicable to Gram-positive microorganisms.

A basic procedure for whole genome shuffling through organized heteroduplex
shuffling is set forth in Figure 34.

In step A, chromosomal DNA of the organism to be improved is digested with
suitable restriction enzymes and ligated into a cosmid. The cosmid used for cosmid-based
heteroduplex guided WGS has at least two rare restriction enzyme recognition sites (e.g., SfiI
and NotI) to be used for linearization in subsequent steps. Sufficient cosmids to represent the
complete chromosome are purified and stored in 96-well microtiter dishes. In step B, small
samples of the library are mutagenized in vitro using hydroxylamine or other mutagenic
chemicals. In step C, a sample from each well of the mutagenized collection is used to
transfect the target cells. In step D, the transfectants are assayed (as a pool from each
mutagenized sample-well) for phenotypic improvements. Positives from this assay indicate
that a cosmid from a particular well can confer phenotypic improvements and thus contains
large genomic fragments that are suitable targets for heteroduplex mediated shuffling. In step
E, the transfected cells harboring a mutant library of the identified cosmid(s) are separated by
plating on solid media and screened for independent mutants conferring an improved
phenotype. In step F, DNA from positive cells is isolated and pooled by origin. In step G, the
selected cosmid pools are divided so that one sample can be digested with SfiI and the other
with NotI. These samples are pooled, denatured, reannealed, and religated.

In step H, target cells are transfected with the resulting heteroduplexes and
propagated to allow "recombination" to occur between the strands of the heteroduplexes in
vivo. The transfectants can be screened (the population will represent the pairwise
recombinants) or, commonly, as represented by step I, the recombined cosmids are further
shuffled by recursive in vitro heteroduplex formation and in vivo recombination (to generate a
complete combinatorial library of the possible mutations) prior to screening. An additional
mutagenesis step could also be added for increased diversity during the shuffling process.

In step J, once several cosmids harboring different distributed loci have been
improved, they are combined into the same host by chromosome integration. This organism
can be used directly or subjected to a new round of heteroduplex guided whole genome
shuffling.
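
Step A calls for sufficient cosmids to represent the complete chromosome. One common way to estimate that number, offered here only as a hedged sketch and not as part of the disclosed procedure, is the Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that any given locus is present. The genome size, insert size and probability below are hypothetical values chosen for illustration.

```python
import math


def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - insert/genome)."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical values: a 4.6 Mb genome and 40 kb cosmid inserts.
n = clones_needed(4_600_000, 40_000)
plates = math.ceil(n / 96)  # number of 96-well dishes needed to hold the library
```

The estimate assumes random, unbiased cloning of fragments; a real library may need more clones if some regions are under-represented.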

EXAMPLES 

The following examples are offered to illustrate, but not to limit the present
invention. Essentially equivalent variations upon the exact procedures set forth will be
apparent to one of skill upon review of the present disclosure.

A. EXAMPLE 1: EVOLVING HYPER-RECOMBINOGENIC RECA

RecA protein is implicated in most E. coli homologous recombination
pathways. Most mutations in recA inhibit recombination, but some have been reported to
increase recombination (Kowalczykowski et al., Microbiol. Rev., 58, 401-465 (1994)). The
following example describes evolution of RecA to acquire hyper-recombinogenic activity
useful in in vivo shuffling formats.

Hyperrecombinogenic RecA was selected using a modification of a system
developed by Shen et al. (Genetics 112, 441-457 (1986); Shen et al., Mol. Gen. Genet. 218,
358-360 (1989)) to measure the effect of substrate length and homology on recombination
frequency. Shen & Huang's system used plasmids and bacteriophages with small (31-430 bp)
regions of homology at which the two could recombine. In a restrictive host, only phage that
had incorporated the plasmid sequence were able to form plaques.

For shuffling of recA, endogenous recA and mutS were deleted from host
strain MC1061. In this strain, no recombination was seen between plasmid and phage. E. coli
recA was then cloned into two of the recombination vectors (Bp221 and πMT631c18).
Plasmids containing cloned RecA were able to recombine with homologous phage: λV3 (430
bp identity with Bp221), λV13 (430 bp stretch of 89% identity with Bp221) and λlink H (31 bp
identity with πMT631c18, except for 1 mismatch at position 18).

The cloned RecA was then shuffled in vitro using the standard DNase-
treatment followed by PCR-based reassembly. Shuffled plasmids were transformed into the
non-recombining host strain. These cells were grown up overnight, infected with phage λV3,
λV13 or λlink H, and plated onto NZCYM plates in the presence of a 10-fold excess of
MC1061 lacking plasmid. The more efficiently a recA allele promotes recombination between
plasmid and phage, the more highly the allele is represented in the bacteriophage DNA.
Consequently, harvesting all the phage from the plates and recovering the recA genes selects
for the most recombinogenic recA alleles.
Recombination frequencies for wild type and a pool of hyper-recombinogenic
RecA after 3 rounds of shuffling were as follows:

Cross                    Wild Type    Hyper Recom
BP221 x V3               6.5x10'^     3.3x10"^
BP221 x V13              2.2x10'^     1.0x10"^
πMT631c18 x link H       8.7x10^      4.7x10''

These results indicate a 50-fold increase in recombination for the 430 bp substrate, and a 5-
fold increase for the 31 bp substrate.

The recombination frequency between BP221 and V3 for five individual clonal
isolates is shown below, and the DNA and protein sequences and alignments thereof are
included in Figs. 12 and 13.
Wildtype: 1.6 x 10^-4
Clone 2: 9.8 x 10^-3 (61 x increase)
Clone 4: 9.9 x 10^-3 (62 x increase)
Clone 5: 6.2 x 10^-3 (39 x increase)
Clone 6: 8.5 x 10^-3 (53 x increase)
Clone 13: 0.019 (116 x increase)
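
The fold increases listed for the individual clones can be checked with a short calculation. This sketch assumes a wild-type frequency of 1.6 x 10^-4 and clone frequencies of the order of 10^-3, which are the exponents implied by the stated fold increases and by the 0.019 value for clone 13; the variable and function names are illustrative.

```python
WILD_TYPE = 1.6e-4  # assumed exponent, inferred from 0.019 / 116

# Assumed clone frequencies for the BP221 x V3 cross (exponents inferred as above).
CLONES = {2: 9.8e-3, 4: 9.9e-3, 5: 6.2e-3, 6: 8.5e-3, 13: 0.019}


def fold_increase(freq, reference=WILD_TYPE):
    """Ratio of a clone's recombination frequency to the wild-type frequency."""
    return freq / reference


folds = {clone: round(fold_increase(f)) for clone, f in CLONES.items()}
```

With the wild-type value rounded to two significant figures, clones 2, 4, 5 and 6 reproduce the listed 61, 62, 39 and 53-fold increases, while clone 13 comes out slightly above the listed 116-fold, which is consistent with rounding of the wild-type frequency.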

Clones 2, 4, 5, 6 and 13 can be used as the substrates in subsequent rounds of shuffling, if
further improvement in recA is desired. Not all of the variations from the wildtype recA
sequence necessarily contribute to the hyperrecombinogenic phenotype. Silent variations can
be eliminated by backcrossing. Alternatively, variants of recA incorporating individual points
of variation from wildtype at codons 5, 18, 156, 190, 236, 268, 271, 283, 304, 312, 317, 345
and 353 can be tested for activity.

B. EXAMPLE 2: WHOLE ORGANISM EVOLUTION FOR HYPER-
RECOMBINATION

The possibility of selection for an E. coli strain with an increased level of
recombination was indicated from phenotypes of wild-type, ΔrecA, mutS and ΔrecA mutS
strains following exposure to mitomycin C, an inter-strand cross-linking agent of DNA.

Exposure of E. coli to mitomycin C causes inter-strand cross-linking of DNA
thereby blocking DNA replication. Repair of the inter-strand DNA cross links in E. coli
occurs via a RecA-dependent recombinational repair pathway (Friedberg et al., in DNA Repair
and Mutagenesis (1995) pp. 191-232). Processing of cross-links during repair results in
occasional double-strand DNA breaks, which too are repaired by a RecA-dependent
recombinational route. Accordingly, recA- strains are significantly more sensitive than
wildtype strains to mitomycin C exposure. In fact, mitomycin C is used in simple disk-
sensitivity assays to differentiate between RecA+ and RecA- strains.

In addition to its recombinogenic properties, mitomycin C is a mutagen.
Exposure to DNA damaging agents, such as mitomycin C, typically results in the induction of
the E. coli SOS regulon which includes products involved in error-prone repair of DNA
damage (Friedberg et al., 1995, supra, at pp. 465-522).

Following phage P1-mediated generalized transduction of the Δ(recA-
srl)::Tn10 allele (a nonfunctional allele) into wild-type and mutS E. coli, tetracycline-resistant
transductants were screened for a recA- phenotype using the mitomycin C-sensitivity assay. It
was observed in LB overlays with a 1/4 inch filter disk saturated with 10 µg of mitomycin C
that, following 48 hours at 37°C, growth of the wild-type and mutS strains was inhibited within a
region with a radius of about 10 mm from the center of the disk. DNA cross-linking at high
levels of mitomycin C saturates recombinational repair resulting in lethal blockage of DNA
replication. Both strains gave rise to occasional colony forming units within the zone of
inhibition, although the frequency of colonies was ~10-20-fold higher in the mutS strain. This
is presumably due to the increased rate of spontaneous mutation of mutS backgrounds. A
side-by-side comparison demonstrated that the ΔrecA and ΔrecA mutS strains were
significantly more sensitive to mitomycin C with growth inhibited in a region extending about
15 mm from the center of the disk. However, in contrast to the recA+ strains, no MitR
individuals were seen within the region of growth inhibition, not even in the mutS background.
The appearance of MitR individuals in recA+ backgrounds, but not in ΔrecA backgrounds
indicates that MitR is dependent upon a functional RecA protein and suggests that MitR may
result from an increased capacity for recombinational repair of mitomycin C-induced damage.

Mutations which lead to increased capacity for RecA-mediated recombinational
repair may be diverse, unexpected, unlinked, and potentially synergistic. A recursive protocol
alternating selection for MitR and chromosomal shuffling evolves individual cells with a
dramatically increased capacity for recombination.

The recursive protocol is as follows. Following exposure of a mutS strain to
mitomycin C, MitR individuals are pooled and cross-bred (e.g., via Hfr-mediated
chromosomal shuffling or split-pool generalized transduction, or protoplast fusion). Alleles
which result in MitR and presumably result in an increased capacity for recombinational repair
are shuffled among the population in the absence of mismatch repair. In addition, error-prone
repair following exposure to mitomycin C can introduce new mutations for the next round of
shuffling. The process is repeated using increasingly more stringent exposures to mitomycin
C. A number of parallel selections are performed in the first round as a means of generating a
variety of alleles. Optionally, recombinogenicity of isolates can be monitored for hyper-recombination
using a plasmid x plasmid assay or a chromosome x chromosome assay (e.g., that of Konrad,
J. Bacteriol. 130, 167-172 (1977)).

C. EXAMPLE 3: WHOLE GENOME SHUFFLING OF STREPTOMYCES
COELICOLOR TO IMPROVE THE PRODUCTION OF γ-
ACTINORHODIN.

To improve the production of the secondary metabolite γ-actinorhodin from S.
coelicolor, the entire genome of this organism is shuffled either alone or with its close relative
S. lividans. In the first procedure described below, genetic diversity arises from random
mutations generated by chemical or physical means. In the second procedure, genetic
diversity arises from the natural diversity existing between the genomes of S. coelicolor and S.
lividans.

Spore suspensions of S. coelicolor are resuspended in sterile water and
subjected to UV mutagenesis such that 1% of the spores survive (~600 "energy" units using a
Stratalinker, Stratagene), and the resulting mutants are "grown out" on sporulation agar.
Individual spores represent uninucleate cells harboring different mutations within their
genome. Spores are collected, washed, and plated on solid medium, preferably soy agar, R5,
or another rich medium that results in sporulating colonies. Colonies are then imaged and picked
randomly using an automated colony picker, for example the Q-bot (Genetix). Alternatively,
colonies producing larger or darker halos of blue pigment are picked in addition or
preferentially.

The colonies are inoculated into 96 well microtitre plates containing 1/3 x
YEME medium (170 µl/well). Two sterile 3 mm glass beads are added to each well, and the
plates are shaken at 150-250 rpm at 30°C in a humidified incubator. The plates are incubated
for up to 7 days and the cell supernatants are assayed for γ-actinorhodin production.

To assay, 50 µl of supernatant is added to 100 µl of distilled water in a 96 well
polypropylene microtitre plate, and the plate is centrifuged at 4000 rpm to pellet the mycelia.
50 µl of the cleared supernatant is then removed and added to a flat bottom polystyrene 96
well microtitre plate containing 150 µl of 1 M KOH in each well. The resulting plates are then
read in a microtitre plate reader measuring the absorbance at 654 nm of the individual samples
as a measure of the γ-actinorhodin content.
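The two transfers in the assay dilute the culture supernatant before it is read. The following is a minimal sketch of that arithmetic, assuming the volumes are 50 µl into 100 µl of water and then 50 µl into 150 µl of KOH (the unit of the second transfer is inferred from context), and assuming absorbance scales linearly with dilution. The function names are hypothetical.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul of sample is mixed with diluent_ul of diluent."""
    return (sample_ul + diluent_ul) / sample_ul

# Step 1: 50 ul of supernatant into 100 ul of distilled water
step1 = dilution_factor(50, 100)
# Step 2: 50 ul of cleared supernatant into 150 ul of KOH (assumed volumes)
step2 = dilution_factor(50, 150)
total_dilution = step1 * step2

def undiluted_a654(a654_measured, blank=0.0):
    """Scale a blank-corrected A654 reading back to the undiluted supernatant.

    Assumes the reading lies in the linear range of the plate reader.
    """
    return (a654_measured - blank) * total_dilution
```

Under these assumptions each well is read at a 12-fold dilution of the original supernatant, so readings from different plates are comparable only if the same volumes are used throughout.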

Mycelia from cultures producing γ-actinorhodin at levels significantly higher
than that of wild-type S. coelicolor are then isolated. These are propagated on solid
sporulation medium, and spore preparations of each improved mutant are made. From these
preparations protoplasts of each of the improved mutants are generated, pooled together, and
fused (as described in Genetic Manipulation of Streptomyces - A Laboratory Manual,
Hopwood, D.A., et al.). The fused protoplasts are regenerated and allowed to sporulate.
Spores are collected and either plated on solid medium for further picking and screening, or,
to increase the representation of multiparent progeny, are used to generate protoplasts and
fused again (or several times, as described previously for methods to effect poolwise
recombination) before further picking and screening.
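The selection of cultures producing "significantly higher" levels than wild type can be sketched as a simple threshold on the plate readings. The cutoff rule below (wild-type mean plus three standard deviations) is an assumption made for illustration, not a criterion stated in this example, and the well names and absorbance values are hypothetical.

```python
from statistics import mean, stdev

def pick_improved(a654_by_well, wildtype_a654, n_sd=3.0):
    """Return wells whose A654 exceeds the wild-type mean by n_sd standard deviations.

    Wells are returned best first. The n_sd cutoff is an illustrative choice.
    """
    cutoff = mean(wildtype_a654) + n_sd * stdev(wildtype_a654)
    hits = [(well, a) for well, a in a654_by_well.items() if a > cutoff]
    return [well for well, _ in sorted(hits, key=lambda h: h[1], reverse=True)]

# Hypothetical wild-type control wells and mutant wells from one plate
wildtype = [0.20, 0.22, 0.18, 0.21, 0.19]
plate = {"A1": 0.21, "A2": 0.45, "A3": 0.19, "A4": 0.30}
improved = pick_improved(plate, wildtype)
```

A stricter or looser cutoff changes how many mutants enter the protoplast pool, which trades screening effort against the diversity available for fusion.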

Further improved mutants result from the combination of two or more
mutations that have additive or synergistic effects on γ-actinorhodin production. Further
improved mutants can be again mated by poolwise protoplast fusion, or they can be exposed
to random mutagenesis to create a new population of cells to be screened and mated for
further improvements.
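The way recursive poolwise fusion accumulates additive mutations can be illustrated with a toy simulation. This is a sketch under strong assumptions that are not part of the example: each genome is a tuple of 0/1 flags for hypothetical beneficial mutations, fitness is strictly additive, and each progeny draws every locus independently from a random member of the pool. All names and parameter values are illustrative.

```python
import random

def fitness(genome):
    # Additive toy model: each beneficial mutation adds one unit of titer
    return sum(genome)

def poolwise_fusion(pool, n_progeny, rng):
    """Each progeny draws every locus from a randomly chosen member of the pool."""
    n_loci = len(pool[0])
    return [tuple(rng.choice(pool)[i] for i in range(n_loci)) for _ in range(n_progeny)]

def recursive_shuffle(pool, rounds, n_progeny, n_keep, seed=0):
    """Alternate poolwise fusion with screening; return final pool and best score per round."""
    rng = random.Random(seed)
    history = [max(map(fitness, pool))]
    for _ in range(rounds):
        progeny = poolwise_fusion(pool, n_progeny, rng)
        # Parents are screened alongside progeny, so the best score never decreases
        candidates = set(progeny) | set(pool)
        pool = sorted(candidates, key=lambda g: (fitness(g), g), reverse=True)[:n_keep]
        history.append(fitness(pool[0]))
    return pool, history

# Six hypothetical single mutants, each carrying a different beneficial mutation
n_loci = 6
singles = [tuple(1 if i == j else 0 for i in range(n_loci)) for j in range(n_loci)]
final_pool, best_per_round = recursive_shuffle(singles, rounds=5, n_progeny=200, n_keep=10)
```

In this model the best score rises over the rounds because mutations that arose in different parents are brought together in one genome; synergistic effects would require a non-additive fitness function.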

As an alternative to random mutagenesis as a source of genetic diversity, natural
diversity can be employed. In this case, protoplasts generated from wild-type S. coelicolor and
S. lividans are fused together. Spores from the regenerated progeny of this mating are then
either repetitively fused and regenerated to create additional diversity, or they are separated on
solid medium, picked, and screened for enhanced production of γ-actinorhodin. As before, the
improved subpopulations are mated together to identify further improved family shuffled
organisms.

D. EXAMPLE 4: A HIGH THROUGHPUT ACTINORHODIN ASSAY

Additional details on a high-throughput shuffling actinorhodin assay used to
select mycelia are set forth in Figure 32. In brief, shufflants were picked by standard
automated procedures using a Q-bot robotic system and transferred to standard 96 well plates.
After incubation at 30°C for 7 days, the resulting mycelia were centrifuged, and a sample of
cell supernatant was removed and mixed with 0.1 M KOH in a 96 well plate and the
absorbance read at 654 nm. The best positive clones were selected and grown in shake flasks.

Approximately 10^ protoplasts were centrifuged at 3,000 rpm for 7 min. When
more than one strain was used, equal numbers of protoplasts were obtained from each strain.
Most of the buffer was removed and the pellet was suspended in the remaining buffer (~25 µl total
volume) by gentle flicking. 0.5 ml of 50% PEG 1000 was added and mixed with the protoplasts
by gently pipetting in and out 2 times. The mixture was then incubated for 2 minutes. 0.5 ml
of P buffer was added and gently mixed. (This is the fusion at a dilution of 10'*). A ten-fold
serial dilution was performed in P buffer. After 2 minutes, dilutions were plated at 10^ 10""
and 10'' onto R5 plates with 50 µl of each, 2 plates per dilution (for plating, ~20 3 mm
glass beads were used, with gentle shaking). As a first control, for regeneration of protoplasts, the
same number of protoplasts was used as above, adding P buffer to a total of 1 ml (this is the
regeneration at dilution 10"^). The mixture was further diluted (10X) in P buffer. The
dilutions were plated at 10"^, 10"^ and 10** onto R5 plates with 50 µl of each. As a second
control (as a non-protoplasting mycelia background check), the same number of protoplasts as
above was used, adding 0.1% SDS to a total of 1 ml (this is the background at dilution 10'^).
After further 10X dilution in 0.1% SDS, the dilution was plated at 10'\ 10*^ and 10"^ onto R5
plates with 50 µl of each. The plates were air dried and incubated at 30°C for 3 days.

The number of colonies was counted from each plate (those that were
countable), using the number of regenerated protoplasts as 100% and calculating the
percentage of background (usually less than one) and fusion survival (usually greater than 10).
The fusion plates were incubated at 30°C for 2 more days until all colonies were well
sporulated. Spores were harvested from those plates having fewer than 5,000 colonies. Spores
were filtered through cotton, washed once with water, suspended in 20% glycerol and
counted. Those spores are used for further study or culture inoculation, or are simply stored at
-20°C.
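The percentages described above follow from the colony counts, the dilution plated and the volume plated. The following is a minimal sketch of that calculation, assuming 50 µl (0.05 ml) is spread per plate; the colony counts and dilutions in the usage lines are hypothetical and are not taken from this example.

```python
def cfu_per_ml(colonies, dilution, plated_ml=0.05):
    """Colony-forming units per ml of the undiluted mixture.

    dilution is the plated dilution as a fraction (for example 1e-3).
    plated_ml is the volume spread on the plate, assumed here to be 50 ul.
    """
    return colonies / (dilution * plated_ml)

def percent_of_regeneration(sample_cfu, regeneration_cfu):
    """Express a titer relative to the regenerated-protoplast control, defined as 100%."""
    return 100.0 * sample_cfu / regeneration_cfu

# Hypothetical counts from one countable plate of each series
regeneration = cfu_per_ml(colonies=120, dilution=1e-4)
fusion = cfu_per_ml(colonies=150, dilution=1e-3)
background = cfu_per_ml(colonies=9, dilution=1e-2)

fusion_pct = percent_of_regeneration(fusion, regeneration)
background_pct = percent_of_regeneration(background, regeneration)
```

With these illustrative numbers the fusion survival is 12.5% and the background is below 0.1%, which falls within the ranges given in the text (greater than 10 and less than one, respectively).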
E. EXAMPLE 4: WHOLE GENOME SHUFFLING OF RHODOCOCCUS
FOR TWO-PHASE REACTION CATALYSIS

This example shows how to apply the techniques described
herein to technologies that allow the generic improvement of biotransformations catalyzed by
whole cells. Rhodococcus was selected as an initial target both because it is representative of
systems in which molecular biology is rudimentary (as is common in whole cell catalysts, which
are generally selected by screening environmental isolates) and because it is an organism that
can catalyze two-phase reactions.

The goal of whole genome shuffling of Rhodococcus is to obtain an increase in
flux through any chosen pathway. The substrate specificity of the pathway can be altered to
accept molecules which are not currently substrates. Each of these features can be selected for
during whole genome shuffling.

During whole genome shuffling, libraries of shuffled enzymes and pathways are
made, transformed into Rhodococcus and screened, preferably by high-throughput assays,
for improvements in the target phenotype, e.g., by mass spectroscopy for measuring the
product.

As noted above, the chromosomal context of genes can have dramatic effects
on their activities. Cloning of the target genes onto a small plasmid in Rhodococcus can
dramatically reduce the overall pathway activity (by a factor of 5- to 10-fold or more). Thus,
the starting point for DNA shuffling of a pathway (on a plasmid) can be 10-fold lower than the
activity of the wild-type strain. By contrast, integration of the genes into random sites in the
Rhodococcus chromosome can result in a significant (5- to 10-fold) increase in activity. A
similar phenomenon was observed in the recent directed evolution in E. coli of an arsenate
resistance operon (originally from Staphylococcus aureus) by DNA shuffling. Shuffling of this
plasmid produced sequence changes that led to efficient integration of the operon into the E.
coli chromosome. Of the total 50-fold increase in arsenate resistance obtained by directed
evolution of the three gene pathway, approximately 10-fold resulted from this integration into
the chromosome. The position within the chromosome is also likely to be important: for
example, sequences close to the replication origin have an effectively higher gene dosage and
therefore a greater expression level.
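The gene dosage effect of chromosomal position can be put in rough numbers. The following is a minimal sketch, assuming the Cooper-Helmstetter model of the bacterial cell cycle, which is not part of this example. The C period (time needed to replicate the chromosome) and the doubling time used below are illustrative values of the kind quoted for fast-growing E. coli and are not measurements for Rhodococcus.

```python
def relative_gene_dosage(position, c_period_min, doubling_time_min):
    """Average copy number of a locus relative to a locus at the terminus.

    position: 0.0 at the replication origin, 1.0 at the terminus.
    Assumes the relation dosage = 2 ** (C * (1 - position) / tau).
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0 (origin) and 1 (terminus)")
    return 2 ** (c_period_min * (1.0 - position) / doubling_time_min)

# Illustrative (assumed) values: C = 40 min, doubling time = 20 min
near_origin = relative_gene_dosage(0.0, 40, 20)
midway = relative_gene_dosage(0.5, 40, 20)
at_terminus = relative_gene_dosage(1.0, 40, 20)
```

Under these assumed values a gene next to the origin is present at about four times the average copy number of a gene at the terminus; the ratio shrinks as the doubling time lengthens.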

In order to fully exploit unpredictable chromosomal position effects, and to
incorporate them into a directed evolution strategy which utilizes multiple cycles of mutation,
recombination and selection, genes are manipulated in vitro and then transferred to an optimal
chromosomal position. Recombination between plasmid and chromosome occurs in two
different ways. Integration takes place at a position where there is significant sequence
homology between plasmid and chromosome, i.e., by homologous recombination. Integration
also takes place where there is no apparent sequence identity, i.e., by non-homologous
recombination. These two recombination mechanisms are effected by different cellular
machineries and have different potential applications in directed evolution.

To combine the increase in activity that resulted from gene duplication and
chromosomal integration of the target pathway with the powerful technique of DNA shuffling,
libraries of shuffled genes are made in vitro, and integrated into the chromosome in place of
the wild-type genes by homologous recombination. Recombinants are then screened for
increased activity. This process is optionally made recursive as discussed herein. The best
Rhodococcus variants are pooled, and the pool is divided in two. Genes are cloned out of one
half of the pool by PCR, shuffled together and re-integrated into the chromosomes of the other
half of the pool by homologous recombination. Recombinants are once again screened, the best
taken and pooled, and the process optionally repeated.

Sometimes there are complex interactions between enzymes catalyzing
successive reactions in a pathway. Sometimes the presence of one enzyme can adversely
affect the activities of others in the pathway. This can be the result of protein-protein
interactions, or inhibition of one enzyme by the product of another, or an imbalance of primary
or secondary metabolism.

This problem is overcome by DNA shuffling, which produces solutions in the
target gene cluster that bring about improvements in whatever trait is screened. An alternative
approach, which can solve not only this problem but also anticipated future rate-limiting steps
such as supply of reducing power and substrate transportation, is complementation by
overexpression of other as yet unknown genomic sequences.

A library of Rhodococcus genomic DNA in a multicopy Rhodococcus vector
such as pRC1 is first made. This is transformed into Rhodococcus and transformants are
screened for increases in the desired phenotype. Genomic fragments which result in increased
pathway activity are evolved by DNA shuffling to further increase their beneficial effect on a
selected property. This approach requires no sequence information, nor any knowledge or
assumptions about the nature of protein or pathway interactions, or even of the rate-limiting
step; it relies only on detection of the desired phenotype. This sort of random cloning and
subsequent evolution by DNA shuffling of positively interacting genomic sequences is
extremely powerful and generic. A variety of sources of genomic DNA are used, from
isogenic strains to more distantly related species with potentially desirable properties. In
addition, the technique is, in principle, applicable to any microorganism for which the
molecular biology basics of transformation and cloning vectors are available, and for any
property which can be assayed, preferably in a high-throughput format.

Homologous recombination within the chromosome is used to circumvent the
limitations of plasmid evolution and size restrictions, and is optionally used to alter central
metabolism. The strategy is similar to that described above for shuffling genes within their
chromosomal context, except that no in vitro shuffling occurs. Instead, the parent strain is
treated with mutagens such as ultraviolet light or nitrosoguanidine, and improved mutants are
selected. The improved mutants are pooled and split. Half of the pool is used to generate
random genomic fragments for cloning into a homologous recombination vector. Additional
genomic fragments are derived from related species with desirable properties (in this case
higher metabolic rates and the ability to grow on cheaper carbon sources). The cloned
genomic fragments are homologously recombined into the genomes of the remaining half of
the mutant pool, and variants with improved phenotypes are selected. These are subjected to
a further round of mutagenesis, selection and recombination. Again, this process is entirely
generic for the improvement of any whole cell biocatalyst for which a recombination vector
and an assay can be developed. Recursive recombination can be performed to increase the
diversity of the pool at any step in the process.

Efficient homologous recombination is important for the recursivity of the
chromosomal evolution strategies outlined above. Non-homologous recombination results in
a futile integration (upon selection) followed by excision (following counterselection) of the
entire plasmid. Alternatively, if no counterselection were used, there is integration of more
and more copies of plasmid / genomic sequences, which is both unstable and also requires an
additional selectable marker for each cycle. Furthermore, additional non-homologous
recombination will occur at random positions and may or may not lead to good expression of
the integrated sequence.

F. EXAMPLE 5: INCREASING THE RATE OF HOMOLOGOUS
RECOMBINATION IN RHODOCOCCUS

A genetic approach is used to increase the rate of homologous recombination
in Rhodococcus. Both targeted and non-targeted strategies to evolve increases in homologous
recombination are used. Rhodococcus recA is evolved by DNA shuffling to increase its ability
to promote homologous recombination within the chromosome. The recA gene was chosen
because there are variants of recA known to result in increased rates of homologous
recombination in E. coli, as discussed above.

The recA gene from Rhodococcus is DNA shuffled and cloned into a plasmid
that carries a selectable marker and a disrupted copy of the Rhodococcus homolog of the S.
cerevisiae URA3 gene (a gene which also confers sensitivity to the uracil precursor analogue
5-fluoroorotic acid). Homologous integration of the plasmid into the chromosome disrupts
the host uracil synthesis pathway, leading to a strain that carries the selectable marker and is
also resistant to 5-fluoroorotic acid. The shuffled recA genes are integrated, and can be
amplified from the chromosome, shuffled again and cloned back into the integration-selection
vector. At each cycle, the recA genes promoting the greatest degree of homologous
recombination are those that are the best represented as integrants in the genome. Thus a
Rhodococcus recA with enhanced homologous recombination-promoting activity is evolved.
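The enrichment logic of this selection, in which alleles that promote more integration become better represented among integrants, can be sketched with a simple frequency model. The model assumes that an allele's share among integrants is proportional to its starting frequency multiplied by its relative integration rate, and it ignores the new variants created by reshuffling between cycles. The allele names and rates are hypothetical.

```python
def enrich(freqs, integration_rates, cycles):
    """Return allele frequencies among integrants before selection and after each cycle.

    freqs: starting frequency of each allele in the plasmid library.
    integration_rates: relative rate of homologous integration promoted by each allele.
    """
    history = [dict(freqs)]
    for _ in range(cycles):
        weighted = {a: freqs[a] * integration_rates[a] for a in freqs}
        total = sum(weighted.values())
        freqs = {a: w / total for a, w in weighted.items()}
        history.append(freqs)
    return history

# Hypothetical library: wild-type recA and two shuffled variants at equal frequency
start = {"wt": 1 / 3, "v1": 1 / 3, "v2": 1 / 3}
rates = {"wt": 1.0, "v1": 2.0, "v2": 5.0}
history = enrich(start, rates, cycles=3)
```

In this toy case the best variant rises from one third of the library to 62.5% of integrants after one cycle and to more than 93% after three, which is why amplifying recA from the integrant pool enriches for the most recombinogenic alleles.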

Many other genes are involved in several different homologous recombination
pathways, and mutations in some of these proteins may also lead to cells with an increased
level of homologous recombination. For example, mutations in E. coli DNA polymerase III
have recently been shown to increase RecA-independent homologous recombination.
Resistance to DNA cross-linking agents such as nitrous acid, mitomycin and ultraviolet is
dependent on homologous recombination. Thus, increases in the activity of this pathway
result in increased resistance to these agents. Rhodococcus cells are mutagenized and selected
for increased tolerance to DNA cross-linking agents. These mutants are tested for the rate at
which a plasmid will integrate homologously into the chromosome. Genomic libraries are
prepared from these mutants, combined as described above, and used to evolve a strain with
even higher levels of homologous recombination.

The foregoing description of the preferred embodiments of the present
invention has been presented for purposes of illustration and description. It is not
intended to be exhaustive or to limit the invention to the precise form disclosed, and many
modifications and variations are possible in light of the above teaching. Such modifications
and variations which may be apparent to a person skilled in the art are intended to be within
the scope of this invention. All patent documents and publications cited above are
incorporated by reference in their entirety for all purposes to the same extent as if each item
were so individually denoted.

WHAT IS CLAIMED IS:



1. A method of producing a library of diverse multicellular organisms, the
method comprising:
providing a pool of male gametes and a pool of female gametes, wherein at least one of
the male pool or the female pool comprises a plurality of different gametes derived from different
strains of a species or different species, wherein the male gametes fertilize the female gametes;
permitting at least a portion of the resulting fertilized gametes to grow into reproductively
viable organisms;
repeatedly crossing the reproductively viable organisms to produce a library of diverse
organisms; and,
selecting the library for a desired trait or property.

2. The method of claim 1, wherein the library of diverse organisms comprises a
plurality of plants.

3. The method of claim 2, wherein the plants are selected from: Gramineae,
Festucoideae, Poacoideae, Agrostis, Phleum, Dactylis, Sorghum, Setaria, Zea, Oryza, Triticum,
Secale, Avena, Hordeum, Saccharum, Poa, Festuca, Stenotaphrum, Cynodon, Coix, Olyreae,
Phareae, Compositae, and Leguminosae.

4. The method of claim 2, wherein the plants are selected from corn, rice,
wheat, rye, oats, barley, pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean,
clover, alfalfa, lupine, vetch, lotus, sweet clover, wisteria, sweetpea, sorghum, millet, sunflower,
and canola.

5. The method of claim 1, wherein the library of diverse organisms comprises a
plurality of animals.

6. The method of claim 5, wherein the animals are selected from non-human
mammals and fish.

7. The library produced by the method of claim 1.

8. The method of claim 1, further comprising:
crossing a plurality of selected library members by pooling gametes from the selected
members and repeatedly crossing any resulting additional reproductively viable organisms to
produce a second library of diverse organisms; and,
selecting the second library for a desired trait or property.

9. The second library made by the method of claim 8.

10. A method of evolving a cell to acquire a desired property, comprising:
(i.) forming protoplasts of a population of different cells;
(ii.) fusing the protoplasts to form hybrid protoplasts, in which genomes from the
protoplasts recombine to form hybrid genomes;
(iii.) incubating the hybrid protoplasts under conditions promoting regeneration of
cells, thereby producing regenerated cells;
(iv.) repeatedly forming protoplasts from the regenerated cells, fusing the
protoplasts to form hybrid protoplasts, in which genomes from the protoplasts recombine to form
additional hybrid genomes; incubating the additional hybrid protoplasts under conditions
promoting regeneration of cells, thereby producing additional regenerated cells; and,
(v.) selecting or screening to isolate regenerated cells or additionally regenerated
cells that have evolved toward acquisition of the desired property.

11. The method of claim 10, wherein the desired property is selected from: heat
tolerance, ethanol production, ethanol tolerance, acid, improved production and maintenance of
enzyme cofactors, improved production and maintenance of NAD(P)H, and improved glucose
transport.

12. The method of claim 10, further comprising repeating steps ^.H^ ) with
regenerated cells in step (iii.) or additional regenerated cells in step (iv.) being used to form the
protoplasts in step (i.) until the regenerated cells have acquired the desired property.

13. The method of claim 10, comprising step (iv.), wherein step (iv.) is performed
prior to step (v.).

14. The method of claim 10, wherein the hybrid protoplasts comprise cells having
more than two parental genomes.

15. The method of claim 10, wherein the different cells are fungal cells, and the
regenerated cells are fungal mycelia.

16. The method of claim 15, wherein protoplasts are provided by treating mycelia
or spores with an enzyme.

17. The method of claim 15, wherein the fungal cells are from a fragile strain,
lacking capacity for intact cell wall synthesis, whereby protoplasts form spontaneously.

18. The method of claim 15, further comprising treating the mycelia with an
inhibitor of cell wall formation to generate protoplasts.

19. The method of claim 10, further comprising selecting or screening to isolate
regenerated cells with hybrid genomes free from cells with parental genomes.

20. The method of claim 10, wherein a first subpopulation of cells contains a first
marker and the second subpopulation of cells contains a second marker, and the method further
comprises selecting or screening to identify regenerated cells expressing both the first and second
marker.

21. The method of claim 10, wherein the first marker is a membrane marker and
the second marker is a genetic marker.

22. The method of claim 10, wherein the first marker is a first subunit of a
heteromeric enzyme and the second marker is a second subunit of the heteromeric enzyme.

23. The method of claim 10, further comprising transforming protoplasts with a
library of DNA fragments in at least one cycle.

24. The method of claim 23, wherein the DNA fragments are accompanied by a
restriction enzyme.

25. The method of claim 10, further comprising exposing the protoplasts to
ultraviolet irradiation in at least one cycle.

26. The method of claim 10, wherein the desired property is the expression of a
protein, primary metabolite, or secondary metabolite.

27. The method of claim 10, wherein the desired property is the secretion of a
protein or secondary metabolite.

28. The method of claim 27, wherein the secondary metabolite is selected from
taxol, cyclosporin A, and erythromycin.

29. The method of claim 10, wherein the desired property is capacity for meiosis.

30. The method of claim 10, wherein the desired property is compatibility to form
a heterokaryon with another strain.

31. The method of claim 10, further comprising exposing the protoplasts or
mycelia to a mutagenic agent in at least one cycle.

32. A method for whole genome shuffling through organized heteroduplex
shuffling, the method comprising:
(a). providing chromosomal DNA of an organism which is targeted for shuffling,
digesting the chromosomal DNA with one or more restriction enzymes, ligating the chromosomal
DNA into a cosmid, the cosmid comprising at least two rare restriction enzyme recognition sites,
aliquoting, purifying, and storing sufficient cosmids to represent a complete chromosome;
(b). mutagenizing aliquots of the library in vitro using a mutagen;
(c). transfecting a sample from a plurality of the mutagenized aliquots into a population of
target cells;
(d). assaying resulting transfectants for phenotypic improvements;
(e). growing transfected cells harboring a mutant library of the identified cosmid(s) on
media and screening the resulting cell colonies for independent mutants conferring a desired
phenotype;

WO 00/04190 PCT/US99/15972

(f). isolating and pooling DNA from cells identified in the screening;
(g). dividing the selected pools and digesting at least one sample with a rare-cutting
restriction enzyme, pooling the cleaved samples, denaturing the samples, reannealing the samples
and religating the samples; and,
(h). transfecting target cells with the resulting heteroduplexes and propagating the cells to
allow recombination to occur between the strands of the heteroduplexes in vivo.

33. The method of claim 32, further comprising additionally screening the
transfectants.

34. The method of claim 32, further comprising further shuffling the
heteroduplexes by recursive in vitro heteroduplex formation and in vivo recombination prior to
additionally screening the transfectants.

35. The method of claim 33, further comprising performing an additional
mutagenesis step to increase diversity during the shuffling process.

36. The method of claim 32, further comprising combining one or more
heteroduplexes into a host chromosome by chromosome integration.

37. The method of claim 36, further comprising repeating steps (a).-(h)., using
the organism resulting from chromosome integration as the source for chromosomal DNA in step
(a).

38. The method of claim 32, wherein the cosmid comprises restriction sites for
SfiI or NotI.

39. The method of claim 32, wherein the transfectants are assayed as a pool from
each mutagenized aliquot.

40. The method of claim 32, wherein a positive assay result indicates that a
cosmid from a particular aliquot can confer phenotypic improvements and contains large genomic
fragments that are suitable targets for heteroduplex mediated shuffling.

41. The method of claim 32, wherein the mutagen is a chemical mutagen.

42. The method of claim 32, wherein growing transfected cells harboring a
mutant library of the identified cosmid(s) on media comprises plating the transfected cells on
solid media.

CG6ATTTTg'<?TCATGACATTATCAAAAAG%GCCGC6GCCTAAGA6CCC%AGAACCCTG^^ 7 

si: iiisi I ::::::::::::::::::::::: 11 

Clono 6 - AGAGGCCAGAGAAGCCACTTGGCACGGT 28 

eeaplQce 13 G — AGGCCAGAGAAGCCTGTCCCCTTGGT 21 



New Minshali 
Hew Clone 2 



CTGGTTTGCTTTT GCCACTGCCCGCGGTGAAGGCATTACCCGGCGGGAATGCTTCAGCGGCGACCGTGAT 
80 90 100 110 120 lio 

Mew MLnshali CTGGTTTGCTTTTGCCACTGCCCGCGGTGAACCCATTACCCGGCGGGA-TGCTTCAGCGGCGArrnTRaT mo 
New Clone 2 CTGGCTTGCTTTTGCCACTGCCCGCCGTGAAGGCATTACCCGGCGGGAATGC^ I7 

New Clone 6 CTGGTTTGCTTTTGCCACTGCCCGGGGTCAGGGCATTACCCGGCG6GAATGCTTCAGCGGCGACCGTGAT 9B 
eompiace I : CTGGTTTGCTTTTACCATTGCCCGCGGTGAACGCATTACCCGGCGGGAATCCTTCAGCGCCCACC^ 9? 



New Minshall 
New Clone 2 
New Clone 4 
New Clone 5 
New Clone 6 
compute 13 



GCGGTecSTrGTCACGri'Ar 



iici 

15C 



SGCAAC. 



pTTTCTACAj^AACACCTGAJ' 



TftCTgggTftTfiCpTTCPftqftCCyTgTgg 

160 170 leO 190 3Q0 ^io 

GCGGTCCGTCGTCAGGCTACTGCGTATGCATTGCAGACCTTGTGGCA^ 
GCGCTGCGTCCTCACGCTACtGCGTATCCATTGCAGACCTTCTGGCAACAATTTCTA^ 152 
GCGGTGCGTCGTCAGGCTACr Jgg 

GCGGTGCGTCGTCAGGCTACTGCGTATGCATTGCAGACCITGTGGCAACAATTTCTACAAjKca^ 145 

gcgctgcctcgtcaggctactgcgtatgcactgcagaccttgtggcaacaatttctacaajScacct^^ I6B 
gcggtgcgtcgtcacgctactgtctatccactgcagaccttgtggcaaccatttctacaaaacactcga^ ill 



New HlnshAll 
New Clone 2 
New Clone 4 
New Clone S 
New Clone 6 
coRiplece 13 



RCTGTATGfl^CATACAGTAJAATTGCTTC^ACAGAACATATTGACTATCCGGTATTAPPPfinrHTfifir^ftrp 
220 230 340 250 260 270 280 

Jti?i?j[?iXg§S5SS?SJ??ii??gSSSSgSigT*?J?5gJli?S^^^ 



GACTAAAAATGGCTATTGACGAAAACAAACAGAAAGCGTTGGCGGCAGCACTGGGCCAGATTGAGAAACA 
290 ajo 3^0 320 330 340 3^0 

...... . gg?§S5SS?ggg?S?§gjl§i«SSiSilJ§iiSSSggg??ggggggjJggJtg?gggggjtgJ??gjtgi^^ m 



New Mlnahail 
New Clone 3 



New Clone b 
New Clone 6 



r5r5JAJ^ISS£2?nSJSSM?^C*^C*^*A^CCGTTGGCGGCAGCACTGGGC^ 



coaplece 13 GAGTAAAAAtGGCtAttGACCAAAACAAACAGAAAC 30? 



ftTTTfifiTftAj\fifirTrrATn{\TfirnrrTfif^pGAAfiArrfijTrrATfir,ftTpc:caA&rriiTri»f>.r«r.rp>^j 

3*0 370 380 390 400 410 gin 

New Mlnahal! ATTTCGTAAAGGCTCCATCATGCGCCTGGGTCAAGACCGTTCCATGGATCTCGAAACCATCTCTACCfiCT A\Q 

New Cloo* : ATTTGGTAAAGCCTCCATCATGCGCCTGGGTGAAGACCGTTCCATGGATGTGGAAACCA^ Ul 

New Clone 4 f TTTGCTAAAGGCTCCATCATCCGCCTGGGTCAACACCGTTCCATCGATGTCCAAACCATCTC^ 378 

New Clor.e i ATTTGGTAAAGGCTCCATCATGCGCCTCGGTGAAGACCGTTCCATCGATGTGCAAACCATCTCTACCG^ ill 

New Clone 6 ATTTCGTAAAGGCTCCATCATGCCCCTGGGTGAAGACCGTTCCATGCATCTGCAAACCATCTCTACT^ 111 

COttplete 1 2 GTTTGGTAAAGGCTCCATCATCCGCCTGCGGGAAGACCGTTCCATGGATGTGGAAACCAtSc^ 37 7 
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13A1 



TCGCTTTCRCTGGATATCCCCCTTCCGGCAGGTGGTCTGCCGATGGGCCGTATCGTCGAAATCTACGGAC 

430 440 4S0 4 60 470 480 490 

N*w Mlnsha 1 1 TCGCTTTCACTCCATATCGCGCTTG6GGCAGGTGGTCTGCCGATCGGCCGTATCGTCGAAATCTACGGAC 
NQw Clone 2 TCGCTTTCACTGGATATCGCCCTTCGGGCAGGTGGTCTGCCGATGGGCCGTATCGTCGAAATCTACGGAC 
Mew Cione 4 TCGCTTTCACTGGATATCGCACTTGGGGCAGGTGGTCTGCCGATGGGCCGTATCGTCGAAATCTACGGAC 

K«M Clone s tcgctttcactggatatcgcgcttggggcaggtggtctSccgatgggccgtatcgtcgaaatctacggac 

New Clone 6 TCGCTTTCACTGGATATCGCGCTTGGGGCACGTGGTCTGCCGATCGGCCGTATCGTCCAAATCTATGGAC 
eoaplece 13 TCGCTTTCACTGGATATCGCGCTTGGGGCAGGT6GTCT6CC6AT6CGCCGTATCGTC6AAATCTACGGAC 447 



4B9 
432 

m 

448 



Hew HiAshall 
Hew Cione 2 
New Clone 4 
New Clon« b 
New Clone 6 
complete 13 



CGGAATCTTCCGGTAAAACCACGCTGACGCTGCAGGTGATCGCCGCAGCGCAGCGTGAAGCTAAAACCTG 

SOO »10 530 530 S40 ^iO S60 

CGGAATCTTCCGGTAAAACCACGCTGACGCTGCAGGTGATCCCCGCAGCGCAGCGTGAAGGTAAAACCTG 559 

CGGAATCTTCCGGTAAAACCACACTGACCCTGCAGGTCATCGCCGCAGCCCAGCGTGAAGGTAAAACCTG 502 

SSSAATCTTCCGGTAAAACCACGCTGACGCTG 518 

CGGAATCTTCCGGT AAAACC ACACTG ACCCTGCAGGTGATCGCCGCAGCGCAGCGTGAAGCT AAAACCTG 499 

C6GAATCTTCCGGTAAAACCACACTGACGCTGCA6GTGATCGCCGCAGCGCAGCGTGAGGGTAAAACCTG 518 

CGGAATCTTCCGGT AAAACCACGCTGACGCTGCAGGTGATCGCC6CAGCGCAGCGTGAAGGT AAAACCTG 517 



T-GCGTTTA^CGATGCTGAACACGCGCTGGACCCAATCTACGCACGTAAACTGGGCGTCGATATCGACAA 

570 5B0 590 600 610 620 630 

New Hlnshall T-GCGTTTATCGATGCTGAACACGCGCTGGACCCAATCTACGCAC6TAAACTGGGCGTCGATATCGACAA 628 

New Clone 2 T*GCGTTTATCGATGCCGAACACGCGCTGGACCCAATCTACGCACGCAAACTGGGCGTCGATATCGACAA 571 

New Clone i T-CCGTTTATCGATGCTGAACACGCGCTGGACCCAATCTACGCACGTAAACTGGGCGTCGATATCGACAA 58 7 

New Clone 5 TT6CGTTTATCGATGCT6AACACGCGCTAGACCCAATCTACGCAC6TAAACTGGGCGTCGATATCGACAA §69 

New Clone 6 T-GCGTTTATCGATGCTGAACACGCGCTGGACCCAATCTACGCACGTAAACT6GGCGTCGATATCGACAA 987 

cosipXete 1 3 T-GCGTTTATCGATGCTGAACACGCGCTGGACCCGATCTACGCACGTAAACTGGGCGTCGATATC6ACAA 58 6 



CCTCCTGTCyTCCCAeCCGCACACCCCCC^CCAGGCACTyeilAATCTGTyACCCeCTGGCCeeTTCTCCy 

640 650 660 670 680 690 700 

New Minshall CCTGCTGTGCTCCCAGCCGGACACCGCCGAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCGTTCTGGC 698

New Clone 2 CCTGCTGTGCTCCCAGCCGGACACCGGCGAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCGTTCTGGC 641

New Clone 4 CCTGCTGTGCTCCCAGCCCGACACCGGCGAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCGTTCTGGC 657

New Clone 5 CCTGCTGTGCTCCCAGCCGGACACCGGCGAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCGTTCTCGC 639

New Clone 6 CCTCCTGTGCTCCCAGCCGGACACCGGCGAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCGTTCTGGC 657

complete 13 CCTGCTGTGCTCCCAGCCGGACACCGCCCAGCAGGCACTGGAAATCTGTGACGCCCTGGCGCCCTCTGGC 656



GCAGTAGACCTTATCGTCGTTGACTCCGTCGCGGCACTGACGCCGAAAGCGGAAATCGAAGGCGAAATCG 

710 720 730 740 750 760 770

New Minshall GCAGTAGACGTTATCCTCGTTGACTCCGTGGCGGCACTGACGCCGAAAGCGGAAATCGAAGGCGAAATCG 768

New Clone 2 GCAGTAGACGTTATCGTCGTTGACTCCGTGGCGGCACTGACGCCGAAAGCGGAAATCGAAGGCGAAATCG 711

New Clone 4 GCGGTAGACGTTATCGTCGTTGACTCCGTGGCGGCACTGACGCCGAAAGCGGAAATCGAAGGCGAAATCG 727

New Clone 5 GCAGTAGACGTTATCGTCCTTGACTCCGTAGCGGCACTGACGCCGAAAGCGGAAATCGAAGGCGAAATCG 709

New Clone 6 GCTGTAGACGTTATCCTCGTTGACTCCGTGGCGGCACTGTCGCCGAAAGCGGAAATCGAAGGCGAAATCG 727

complete 13 CCAGTGGACGTTATCGTCCTTGACTCCGTGGCGGCACTGACGCCGAAAGCGCAAATCGAAGGCGAAATCG 726



GCGArTrTrj^rATGGGgCTj'GgGnrArGTj^TnATfiRfirrAnnrfiRTGCGpAcrTnf:pn<^fiTAArrTnaji 

780 790 800 810 820 830 840

New Minshall GCGACTCTCACATGGGCCTTGCGGCACGTATGATGAGCCAGGCGATGCGTAAGCTGGCGGGTAACCTGAA 838

New Clone 2 GCGACTCTCACATGGGCCTTGCGGCACGTATGATGAGCCAGGCGATGCGCAAGCTGGCGGGTAACCTGAA 781

New Clone 4 GCCACTCTCACATGGGCCTTGCGGCACGTATGATGAGCCAGGCGATCCGTAAGCTGGCGGGTAACCTGAA 797

New Clone 5 GCGACTCTCACATGGGCCTTGCGGCACGTATGATGAGCCAGGCGATGCGTAAGCTGGCGGGTAACCTGAA 779

New Clone 6 GCGACTCTCACATGGGCCTTGCGGCACGTATGATGAGCCAGGCAATGCGTAACCTGGCGGGTAACCTGAA 797

complete 13 GCGACTCTCACATGGGCCTTGCAGCACGTATGATGAGCCAGGCGATGCGTAAGCTGCCGGGTAACCTGAA 796



FIG. 12B 



WO 00/04190



PCT/US99/15972 



CCAGTCCAACACGCTGCTGATCTTCATCAACCAGATCCGTATGAAAATTGGTGTGATGTTCGGTAACCCG

850 860 870 880 890 900 910

New Minshall GCAGTCCAACACGCTGCTGATCTTCATCAACCAGATCCGTATGAAAATTGGTGTCATGTTCGGTAACCCG 908

New Clone 2 GCAGTCCAACACGCTGCTGATCTTCATTAACCAGATCCGTATGAAAATTGCTGTGATGTTCGCTAACCCG 851

New Clone 4 GCAGTCCAACACGCTCCTGATCTTCATCAACCAGATCCGTATGAAAATTGGTGTGATGTTCGGTAACCCG 867

New Clone 5 GTTGTCCAACACGCTGCTGATCTTTATCAACCAGATCCGTATGAAAATTGGCGTGATGTTCGGTAACCCG 849

New Clone 6 GCAGTCCAACACGCTGCTGATCTTCATCAACCAGATCCGTATGAAAATTGGTGTGATGTTCGGTAACCCG 867

complete 13 GCAGTCCAACACGCTGCTGATCTTCATCAACCAGATCCGTATGAAAATTGGTCTGATGTTCGGTAACCCG 866



GAAACCACTACCGGTGGTAACGCGCTGAAATTCTACGCCTCTGTTCGTCTCGACATCCGTCGTATCGGCG 

920 930 940 950 960 970 980

New Minshall CAAACCACCACCGGTGGTAACGCGCTGAAATTCTACGCCTCTGTTCGTCTCGACATCCGTCGTATCGGCG 978

New Clone 2 GAAACCACTACCGGTGGTAACGCGCTGAAATTCTACGCCTCCGTTCGTCTCCACATCCGTCGTATCGGCG 921

New Clone 4 GAAACCACTACCGGTGGTAACGCGCTGAAATTCTACGCCTCTGTTCGTCTCGACATCCGTCGTATCGGCG 937

New Clone 5 CAAACCACCACCGGTGGTAACGCGCTGAAATTCTACGCCTCTCTTCGTCTCCACATCCGTCGTATCCGCG 919

New Clone 6 GAAACCACCACCGGTGGTAACGCGCTGAAATTCTACGCCTCTGTTCGTCTCGACATCCGTCGTATCGGCG 937

complete 13 GAAACCACTACCGCTGGTAACGCGCTGAAATTCTACGCCTCTGTTCGTCTCGACATCCGTCGTATCGGCG 936



CGGTGAAAGj^GGGCGAAAACGTGGTGGGTAGCGAAACCeGCGTGAAAGT^GTGAAGAAC^aAATCCCTGp 

990 1000 1010 1020 1030 1040 1050

New Minshall CGGTGAAAGAGGGCGAAAACGTGGTGGGTAGCGAAACCCGCGTGAAAGTGGTGAAGAACAAAATCGCTGC 1048

New Clone 2 CGGTGAAAGAGGGCGAAAACGTGGTGGGTAGCGAAACCCGCGTGAAAGTGGTGAAGAACAAAATCGCTGC 991

New Clone 4 CGGTGAAAGAGGGCGAAAACGTGGTGGGTAGCGAAACCCCCGTGAAAGTGGTGAAGAACAAAATCGCTGC 1007

New Clone 5 CGGTGAAAGAGGGCGAAAACGTCGTGGGTAGCGAAACCCGCGTGAAAGTGGTGAAGAACAAAATCGCTGC 989

New Clone 6 CAGTGAAAGAGGGCGAAAACGTGGTGGGTAGCGAAACCCGCGTGAAAGTGGTGAAGAACAAAATCGCTGC 1007

complete 13 CGGTGAAAGAGGGCGAAAACGTGGTGGGTAGCGAAACCCGCGTGAAAGTGGTGAAGAACAAAATCGCTGC 1006



CCCCTTTAA^CAGGCTGAAJTCCAGATCCTCTACGGCGAAGGTATCAACJTCTACGGCGAACTGGTTGAy 

1060 1070 1080 1090 1100 1110 1120

New Minshall GCCGTTTAAACAGGCTGAATTCCACATCCTCTACGGCGAACGTATCAACTTCTACGGCGAACTCGTTGAC 1118

New Clone 2 GCCGTTTAAACAGGCTGAATTCCAGGTCCTCTACGGCGAAGGTATCAACTTCTACGGCGAACTGGTTGAC 1061

New Clone 4 GCCGTTTAAACAGGCTGAATTCCAAATCCTCTACGGCGAAGGTATCAACTTCTACGGCGAACTGGTTGAC 1077

New Clone 5 GCCGTTTAAACAGCCTGAATTCCAGATCCTCTACGGCCAACGTATCAACTTCTACGGCGAACTGGTTGAC 1059

New Clone 6 GCCGTTTAAACAGGCTGAATTCCAGATCCTCTACGGCGAAGGTATCAACTTCTACGGCGAACTGGTTGAC 1077

complete 13 GCCGTTTAAACAGGCTGAATTCCAAATCCTCTACGACGAAGGTATCAACTTCTACGGCGAACTGGTTGAC 1076



CTGGCCGTAAAAGAGAAGCTGATCCAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGATCGGTC

1130 1140 1150 1160 1170 1180 1190

New Minshall CTGGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGATCGGTC 1188

New Clone 2 CTGGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGAGAGAAGATTGGTC 1131

New Clone 4 CTGGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGATCGGTC 1147

New Clone 5 CTGGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGATCGGTC 1129

New Clone 6 CTCGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGGTTGGTC 1147

complete 13 CTGGGCGTAAAAGAGAAGCTGATCGAGAAAGCAGGCGCGTGGTACAGCTACAAAGGTGAGAAGGCCCGTC 1146



ARCCTAAARpAATftrfiArjftrrTnfirTfi^aAnaTanpr^ftraaiirrftrpAaftaft&Tr^afia&fi&afcftj^

1200 1210 1220 1230 1240 1250 1260

New Minshall AGGGTAAAGCGAATGCGACTGCCTGGCTGAAAGATAACCCGGAAACCGCCAAAGAGATCGAGAAGAAAGT 1258

New Clone 2 AGGGTAAAGCGAACGCGACTGCCTGCCTGAAAGATAATCCGGAAACCCCGAAAGAGATTGAGAAGAAAGT 1201

New Clone 4 AGGGTAAAGCGAATGCGACTGCCTGGCTGAAAGATAACCCCGAAACCCCGAAAGAGATCGAGAAGAAAGT 1217

New Clone 5 AGGGTAAAGCGAATGCGGCTGCCTGGCTGAAAGGTAACCCGGAAACCGCGAAAGAGATCGAGAAGAAAGT 1199

New Clone 6 AGGGTAAAGCGAATGCGACTGCCTGGCTGAAAGATAACCCGGAAACCGCGAAAGAGATCGAGAAGAAAGT 1217

complete 13 AGGGTAAAGCGAATGCGACTCCCTGGCTGAAAGATAACCCGGAAACCGCGAAAGAGATCGAGAAGAAAGT 1216
ACCTGAGTTGCTGCTCAGCAACCCGARCTCAACGCCGGATTTCTCTGTAGATGATAGCGAAGS CGTAGCA

1270 1280 1290 1300 1310 1320 1330

New Minshall ACGTGAGTTGCTGCTGAGCAACCCGAACTCAACGCCGGATTTCTCTGTAGATGATAGCGAAGGCGTAGCA 1328

New Clone 2 ACGTGAGTTGCTCCTCAGCAACCCGAACTCAACGCCGGATTTCTCTGGAGATGATACCGAACGCGTACCA 1271

New Clone 4 ACGTGAGTTGCTGCTGAGTAACCCGAACTCAACGCCGGATTTCTCTGTAGATGATAGCGAAGGCGTAGCA 1287

New Clone 5 ACGTGAGTTGCTGCTCAGCAACCCGAACTCAACGCCGGATTTCTCTAGAGATGATAGCGAAGGCGTAGCA 1269

New Clone 6 ACGTGAGTTGCTGCTGAGCAACCCGAACTCAACGCCGGATTTCTCTGTAGATGATAGCGAAGGCGTAGCA 1287

complete 13 ACGTGAGTTGCTGCTGAGCAACCCGAACTCAACGCCGGATTTCTCTGTAGATGATAGCGAAGGCGTAGCA 1286



CAAACTAACGAAGATTTTTAATCGTCTTGTTTGATACACAAGGGTCGCATCTGCCGCCCTTTTCCTTTTT

1340 1350 1360 1370 1380 1390 1400

New Minshall GAAACTAACGAAGATTTTTAATCGTCTTGTTTCATACACAACGGTCGCATCTGCGGCCCTTTTGCTTTTT 1398

New Clone 2 GAAACTAACGAAGATTTTTAATCGTCTTGTTTGATACACAAGGGTCGCATCTGCGGCCCTTTTGCTTTTT 1341

New Clone 4 GGAACTAACGAAGATTTTTAATCCTCTTGTTTGATACACAACGGTCGCATCTGCGGCCCTTTTGCTTTTT 1357

New Clone 5 GAAACTAACGAAGATTTTTAATCGTCTTGTTTAATACACGAGGGTCGCATCTGCGGCCCTTTTCCTTTTT 1339

New Clone 6 GAAACTAACGAAGATTTTTAATCSTCTTGTTTCATACACAAGGGTCGCATCTGCGGCCCTTTTCCTTTTT 1357

complete 13 GAAACTAACGAAGATTTTTAATCGTCTTGTTTGATACACAAGGGTCGCATCTGCGGCCCTTTTCCTTTTT 1356



TAACTTeTA|^GGATATGee|^TgACAGAATCAACATgCCCJCXXXXXXXXyXXXXXXXXX|exXXX1CKXyX|e 

1410 1420 1430 1440 1450 1460 1470

New Minshall TAAGTTGTAAGGATATGCCATGACAGAATCAACATCCCGTCGGCCTGGTAGGCCATTTTTTGGATCTTCA 1468

New Clone 2 TAAGTTGTAAGGATATGCCATGACAGAATCAACATCCCGTC 1382

New Clone 4 TAAGTTGTAGGGATATGCCATGACAGAATCAACATCCCGTCGCCCTCGTAGCCCATTTTTTCCATCTTCA 1427

New Clone 5 TAAGTTGTAAGGATATGCCATGACAGAATCAACATCCAGTC 1380

New Clone 6 1343

complete 13 TAAGTTGTAAGGATATGCCATGA 1379



xxxxxxxxxxxxxxxxx 

1480 

New Minshall CCTAGATCCTTTTAAAT - 1485

New Clone 2 1382

New Clone 4 CCT 1430

New Clone 5 1380

New Clone 6 1343

complete 13 1379
orig prot
clone 2 prot
clone 4 prot
clone 5 prot
clone 6 prot
clone 13 prot



MTGVKMAI DEWKQRALAft^LCQIEKQrCKGSIHRIGEDRSMDVETISTGSLSLDIRLGAGGLPHGRIVEI 
MTrX5lJfT8SS5S5?f'?I?J'S2ii59CS5SSJ«RJ^GEDRSMDVETl^^ 70 

S;gX55JSJR525255V^5(^tS2JI52i:S5SsiMRLGEpRaM^^^^ 70 



orig prot
clone 2 prot
clone 4 prot
clone 5 prot
clone 6 prot
clone 13 prot



YGPESSGKTTLTLQVIA AAyRECKTCAriDAEHALDPIYARKLGVDIDHLLCSQPDTGEyALEICDALAR 



110 120 130 140



YGPESSGKTTtTLQVIAAAQREGKTCAFlOAEHALDPXyARKLCVDIDSLLCSOPDTSEOALEIcnALaR 



rCAFIOAEM, 



^ CDALAR 

XSSillS555}'55'SXiM?2&ISK^ 

YGPESSGKTTLTLQVIAAAORECKTCAFXDAEaALOPlYARKLGVOIOKLLCSOPOTGEQALEICOALAR 



YGPESSGKTTLTLOyiAAAQREi 
YGPESSGKTTLTLQVIAAA " 



R£§!(?CAFXDAERi 



140 
140 

m 

140 
140 



orig prot
clone 2 prot
clone 4 prot
clone 5 prot
clone 6 prot
clone 13 prot



SGAVPVIVVpSVAALTPKAMEGEIGDSHMGLAARMMSOA MRKLAGNLKpSWTLLIFIWQIRMKIGVMFG 

150 160 170 180 190 200 210



l?JXSXJXXSlX?t^3S5?i5ISiJSSiSSSi'**^«wsQAMRKL^^ 

lr5XSXTXunlX??J'5II^SJ5SIJS2l55gH?^"SQAMRKLAGNLKOSNTLLIFINOIR^^ 210 

SGAyoyiyypsyAAj.||jj|||g||gg|g^^^ 21c 



SGAVOVIVVDSVAAL 



iSJXSXJXSSIX»Ji*JlifJ5ilSiJS2fS5Si'J*^«"SOAMRKLAGNLKOSMTX-LIFINOXRMKIGVMrG 210 
SGAVDVIVVDSVAALTPRAEXECEXGOSHMCLAARHMSQAMRKLAGNLKQSSTLLiriHQIRMKIGVMFG 210 



WPETTTCGHyifKFYftSVRLpXRRIGAVKEpWVVGSETRyKV VKNKIAAPFKOAEFOILYGEGIMFYGEL 

220 230 240 250 260 270 

orig prot NPBTTTGGNALKFyASVRLDIRRXGAVREGEMVVGSETRVKVVKNKIAAPFKOAEFOTLYGEClHrYcrL 280

clone 2 prot NPBTTTGGNALKPYASVRLDXRRXGAVRBGENVVGSETRVRVVKNK^ 280

clone 4 prot NPETTTGGNALKFYASVRLDXRRXGAVREGENVVGSETRVKVVKNKXA^^ 280

clone 5 prot gPETTTCGNALKFYASVRLDIRRf GAVKEGENV^ 280

clone 6 prot NPETTTGGNALKFYASVRLDXRRXGAVKEGENVVGSETRVXVVKHKXAAPFROABrOZLYGEGINFYGEl! 280

clone 13 prot NPETTTCGMALKFYASVRLDIRRXGTVKEGEMVVGSETRVKVVKNKIAAPFK^^ 280



VDLCVKEKLXEKAGAWYSYKGEKIGQGKAHATAWLKOMPETAKEIEKKVRELLLSKPHSTPDFSVDDSEG 

290 300 310 320 330 340 350

orig prot y0I«GVKERLIEXAGAtfY3YXGERXGQGKANATAHLK0NPETAXEXEKRVR6LLLSNPNSTPDFSVDDS£G 350

Clone 3 prot VOLGVRERLXEXAGAKYSYXGEKXGOGRAHATAtfLRDNPETAKEIEKRVR^ 350

clone 5 prot VDLGVKEKLI£KAGAHySYRGEKX6QGKAHAAAWLKGMPSTAKEXEKRVRELLLSMPHSTPDFSRDDS£G 350
clone 6 prot VOtGVKERLTEKAGAHYSYRGEXVGOGRANATAKLKDNPETAREIEKKVRELLtS^ 350
clone 13 prot VDMGVKEKLIERAGAMYSYRGEKAGOGRAliATAWLKONPETAKEIEKRVRELLLsS^ 350



VAETHKnr 

orig prot VAETHEOF 358

New Clone 2 VAETNEDF

New Clone 4 VAGTHEDF 358

New Clone 5 VAETHEOF 358

New Clone 6 VAETBEDF 358

complete 13 VAETHEOF 358



FIG. 13 
Few cycles of point mutation

gene to be optimized

Homologous recombination

Chromosomal Copy

Episome

"Cleaned" Chromosome

Fig. 22

High Throughput Family Shuffling

Outside Primer

Outside Primer

E. coli DnaJ

Anneal

Salmonella DnaJ Homolog

Priming Site

PCR

Family Shuffle

Fig. 24

Accumulation of Mutations by Sequential Mutagenesis, Pairwise Recombination, and Poolwise Recombination

0 1 2 3

Cycles

Fig. 25B

Protoplast Formation

mutation

cell wall

Protoplast Fusion

Fig. 26